JPH08251540A

JPH08251540A - Video summarizing method

Info

Publication number: JPH08251540A
Application number: JP7046970A
Authority: JP
Inventors: Shin Yamada; 伸山田; Katsuhiro Kanamori; 克洋金森
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 1995-03-07
Filing date: 1995-03-07
Publication date: 1996-09-27
Anticipated expiration: 2016-12-25
Also published as: JP3240871B2

Abstract

PURPOSE: To provide a method for automatically summarizing and displaying video, for use in devices that support video retrieval, editing, processing, and quick viewing, which solves the problem that a user viewing a summarized video cannot grasp the desired content, and which efficiently conveys content centered on the motion of the subject. CONSTITUTION: A frame memory 3 captures frame images of the video signal output from a video disk device 1 or a VTR 2. A first computer 5 sends a control signal to the frame memory 3 to capture frame images, summarizes the video by processing the time-series frame images, and stores the data in a file server 6. A second computer 7 retrieves the summarized video from the file server 6 at the user's request and plays it back.

Description

Detailed Description of the Invention

[0001]

[Industrial Field of Application] The present invention relates to methods for supporting video retrieval, editing, processing, quick viewing, and the like, and in particular to a video summarization method that summarizes video stored on videotape or video disc and plays back or displays it.

[0002]

[Prior Art] In recent years, research on methods that apply computers and the like to support video retrieval, editing, processing, and quick viewing has become active. One example is video summarization, in which part of a video or the entire video is played back in a short time so that the content of a video or film can be grasped quickly.

[0003] Conventional methods that have been proposed include fast-forward playback, the video content compression display processing method described in Japanese Patent Laid-Open No. 4-237284, per-shot variable-speed playback (Otsuji and Tonomura, "Subjective Evaluation of High-Speed Browsing of Moving Images," 1993 IEICE Spring Conference, SD-9-3), and per-shot rush playback. A shot is a unit of video often used in fields such as video editing, and it is close to the smallest unit of video content.

[0004] Conventional fast-forward playback is a video summarization method that plays back frame images thinned out at fixed time intervals.

[0005] The video content compression display method is a video summarization method that uses the amount of change between time-series frame images to determine the display importance of each frame image, and devotes more display time to frame images of higher importance. For practical use, the part that determines the display importance of the frame images is critical.
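As a rough illustration of devoting more display time to more important frames, the sketch below splits a fixed time budget in proportion to each frame's importance. Proportional allocation, the function name `allocate_display_time`, and the even split used when every importance is zero are assumptions made for this example; the text only states that more important frames receive more time.

```python
def allocate_display_time(importances, total_time):
    """Split total_time across frames in proportion to display importance.

    Proportional allocation is an assumption for illustration only.
    """
    total = sum(importances)
    if total == 0:
        # No frame stands out: fall back to an even split (assumed behavior).
        return [total_time / len(importances)] * len(importances)
    return [total_time * weight / total for weight in importances]


if __name__ == "__main__":
    # A frame with three times the importance gets three times the time.
    print(allocate_display_time([1, 3], 8.0))
```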

[0006] The pixel-level luminance change between adjacent frame images is sensitive to the motion of objects in the picture. Accordingly, as a display importance determination method for cases where one wants to skip portions with little motion and view portions with much motion slowly, a method has been proposed that treats this pixel-level luminance change as the display importance.

[0007] By contrast, the frame-level luminance change between adjacent frame images is relatively insensitive to object movement within a shot, and takes a large value when the luminance distribution of the whole frame changes, as it does at a shot change. Accordingly, as a display importance determination method for cases where one wants to watch shot changes closely, a method has been proposed that treats this frame-level luminance change as the display importance.
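The difference between the two measures can be illustrated with a small sketch. The exact definitions are not given in the text, so both functions below are assumptions: the pixel-level change is taken as the mean absolute per-pixel difference, and the frame-level change as the absolute difference of the mean luminance of each frame. Frames are plain nested lists of luminance values.

```python
def pixel_level_change(prev, curr):
    """Mean absolute per-pixel luminance difference (assumed definition)."""
    total = 0
    count = 0
    for row_prev, row_curr in zip(prev, curr):
        for p, c in zip(row_prev, row_curr):
            total += abs(c - p)
            count += 1
    return total / count


def frame_level_change(prev, curr):
    """Absolute difference of the mean frame luminance (assumed definition)."""
    def mean(frame):
        values = [v for row in frame for v in row]
        return sum(values) / len(values)
    return abs(mean(curr) - mean(prev))


if __name__ == "__main__":
    # A bright object moves one pixel to the left inside the same shot.
    before = [[0, 255], [0, 0]]
    after = [[255, 0], [0, 0]]
    print(pixel_level_change(before, after))   # reacts to the motion
    print(frame_level_change(before, after))   # overall luminance is unchanged
```

With these definitions the moving object changes the pixel-level measure but not the frame-level one, which matches the sensitivity described above.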

[0008] Conventional per-shot variable-speed playback is a video summarization method that plays back the video while controlling the playback speed so that the display time of each shot is constant.

[0009] Per-shot rush playback is a video summarization method that plays back the leading portion of each shot, one after another, at standard speed.
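The two shot-based methods above can be sketched in a few lines. The function names, the frame-count units, and the `(start, end_exclusive)` range convention are assumptions chosen for this example.

```python
def variable_speed_per_shot(shot_lengths, display_frames):
    """Speed multiplier per shot so every shot is shown for display_frames frames."""
    return [length / display_frames for length in shot_lengths]


def rush_playback(shot_starts, shot_lengths, head_frames):
    """Frame ranges (start, end_exclusive) covering the head of each shot.

    Each range is meant to be played at standard speed; a shot shorter than
    head_frames is played in full.
    """
    return [(start, start + min(head_frames, length))
            for start, length in zip(shot_starts, shot_lengths)]


if __name__ == "__main__":
    lengths = [300, 30]
    print(variable_speed_per_shot(lengths, 30))   # long shots are played faster
    print(rush_playback([0, 300], lengths, 60))
```

The first function makes the weakness discussed later visible: a long shot ends up at a high speed purely because of its length, regardless of its content.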

[0010] Finally, a supplementary explanation of shots. A portion shot continuously in time with a single video camera is called a shot. As noted above, a shot is close to the smallest unit of video content. A point where footage was joined in editing, or where the video camera's recording was interrupted, is a "shot change."

[0011] However, a portion where the video content changes through camera operations such as panning or zooming is sometimes also treated, as an exception, as a "shot change." In that case, shots come closer to the smallest unit of video content than when the exception is not considered.

[0012] Methods proposed for automatically dividing video into shots include the video change model method (Yamada, Fujioka, Kanamori, Matsushima, and Sakauchi, "Scene Change Detection Method for Video Including Editing Effects," Symposium on Multimedia and Video Processing '94).

[0013]

[Problems to Be Solved by the Invention] However, because the fast-forward playback method described above plays back at a constant speed regardless of the video content, there are "portions whose playback speed is subjectively fast or slow," so the content is hard to grasp and the user tires easily.

[0014] For the video content compression display method described above, only two methods of determining display importance had been proposed: one using the pixel-level luminance change between adjacent frame images, and one using the frame-level luminance change. With the former, because the pixel-level luminance change is unsuitable for subjective evaluation of motion, "portions whose playback speed is subjectively fast or slow" exist, so the content is hard to grasp and the user tires easily. With the latter, only the first frame image of each shot is displayed, so the method cannot be used when one wants to grasp the content centered on the motion of the subject.

[0015] In the per-shot variable-speed playback method described above, the playback speed is determined by the duration of the shot, so the user cannot grasp the content of some shots, and the method cannot be used when one wants to know the temporal flow of the content in the video.

[0016] To know the temporal flow of the content, it is important to grasp the content of each shot, which is the unit of content, and to avoid overlooking shots it is important to be able to anticipate shot changes. In per-shot variable-speed playback, however, shot changes cannot be anticipated except in the case where "shots change at points joined in editing and at points where the video camera's recording was interrupted." When a portion shot with the same camera operation is treated as a shot, or when one scene of a scenario is treated as a shot, shot changes cannot be anticipated and shots are overlooked.

[0017] In video, a combination of multiple shots forms a unit of content of a higher order than a shot, such as a scene. Because the per-shot variable-speed playback method described above plays a little of every shot, portions with similar content are played back consecutively, so the method cannot be used when one wants to view portions with content as different as possible efficiently.

[0018] The present invention solves the above problems of the prior art, and its object is to provide a video summarization method that lets the user grasp the desired content.

[0019]

[Means for Solving the Problems] To achieve this object, first, time sections of images containing fast-moving objects are detected and treated as high-speed motion sections. Time sections in which similar images continue for a fixed time or longer are also detected and treated as long-time similar sections. High-speed motion sections are played back more slowly than the other time sections, and long-time similar sections are played back faster than the other time sections. Furthermore, after the video is grouped by content and divided into multiple shots, a lower limit is placed on the display time of each shot.
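A minimal sketch of this first means is given below, under stated assumptions. The 4x speed for ordinary sections is the value used in the embodiment; the 2x and 8x values, the section labels, and the rule of slowing every section of a shot uniformly until the lower limit on display time is met are all hypothetical choices made for this example.

```python
# Speed multipliers per section type. Only the 4x standard speed comes from
# the embodiment; the other two values are hypothetical.
SPEEDS = {
    "standard": 4.0,
    "high_speed": 2.0,     # fast-moving objects: play slower than standard
    "long_similar": 8.0,   # long runs of similar images: play faster
}


def plan_shot(sections, min_display_frames):
    """Return (kind, length, speed) for each section of one shot.

    sections is a list of (kind, length_in_frames). If the shot would be
    shown for fewer than min_display_frames, every speed is scaled down by
    the same factor (an assumed way of enforcing the lower limit).
    """
    plan = [(kind, length, SPEEDS[kind]) for kind, length in sections]
    shown = sum(length / speed for _, length, speed in plan)
    if shown < min_display_frames:
        scale = shown / min_display_frames
        plan = [(kind, length, speed * scale) for kind, length, speed in plan]
    return plan


if __name__ == "__main__":
    print(plan_shot([("standard", 40)], 20))
    print(plan_shot([("high_speed", 40), ("long_similar", 160)], 20))
```

A very short shot can end up slower than 1x under this rule; whether that is acceptable is a design decision the text leaves open.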

[0020] Second, after the video is grouped by content and divided into shots, the time-series shots are played back one after another while evoking a rhythm with a preset period. An upper limit is placed on the display time of each shot. The playback speed is determined so as to satisfy at least one of the following two conditions.

[0021] Rhythm condition: shot boundaries correlate with the rhythm of the preset period.

[0022] Content condition: when the speed upper limit, which is the reference for determining the playback speed, is set so as to guarantee that "the content of any portion played back at or below the speed upper limit can always be grasped," the playback speed of at least part of each shot is at or below the speed upper limit.
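The two conditions can be expressed as simple checks on a playback plan. This is only a sketch: the plan format (each shot is a list of `(length_in_frames, speed)` segments) is hypothetical, and the rhythm check adopts one possible reading of "correlate," namely that every shot boundary in the summary falls exactly on a beat of the preset period.

```python
def satisfies_content_condition(shots, speed_limit):
    """True if every shot has at least one segment at or below speed_limit."""
    return all(any(speed <= speed_limit for _, speed in shot) for shot in shots)


def satisfies_rhythm_condition(shots, period_frames, tolerance=1e-9):
    """True if every shot boundary in the summary lands on a beat.

    Assumed reading of the rhythm condition; the text only requires that
    boundaries and the rhythm be correlated.
    """
    elapsed = 0.0
    for shot in shots:
        elapsed += sum(length / speed for length, speed in shot)
        beats = elapsed / period_frames
        if abs(beats - round(beats)) > tolerance:
            return False
    return True


if __name__ == "__main__":
    plan = [[(120, 4.0)], [(30, 1.0), (240, 8.0)]]
    print(satisfies_content_condition(plan, 4.0))
    print(satisfies_rhythm_condition(plan, 30))
```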

[0023] Third, the video to be summarized is divided into shots. Then, using the correlation between the time-series shots, similar shots are merged to create shot groups. One partial moving image is selected from each shot group, and the partial moving images are played back one after another.
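The third means can be sketched as follows. Several choices here are assumptions rather than details from the text: only consecutive shots are merged, each shot is compared with the last shot of the current group, the similarity function is supplied by the caller, and the clip chosen for each group is the head of its first shot.

```python
def group_shots(shots, similarity, threshold):
    """Merge consecutive shots whose similarity is at least threshold."""
    groups = []
    for shot in shots:
        if groups and similarity(groups[-1][-1], shot) >= threshold:
            groups[-1].append(shot)
        else:
            groups.append([shot])
    return groups


def pick_clips(groups, clip_frames):
    """One (start, end_exclusive) clip per group: the head of its first shot."""
    clips = []
    for group in groups:
        start, end = group[0]["start"], group[0]["end"]
        clips.append((start, min(end, start + clip_frames)))
    return clips


if __name__ == "__main__":
    # "feature" is a hypothetical one-number summary of each shot's content.
    shots = [
        {"start": 0, "end": 100, "feature": 0.10},
        {"start": 100, "end": 150, "feature": 0.15},
        {"start": 150, "end": 400, "feature": 0.90},
    ]
    sim = lambda a, b: 1 - abs(a["feature"] - b["feature"])
    print(pick_clips(group_shots(shots, sim, 0.8), 60))
```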

[0024]

[Operation] With these configurations, first, fast-moving objects no longer appear and each piece of content is displayed for at least a fixed time, so "portions whose playback speed is subjectively fast" disappear. In addition, the video changes in quick succession, so "portions whose playback speed is subjectively slow" disappear. User fatigue is therefore reduced compared with the conventional methods. Moreover, because this method plays back the entire video while keeping the playback speed subjectively within an acceptable range, the content centered on the motion of the subject can be grasped.

[0025] Second, portions with the same content are gathered into shots, and each is played back in a short time. Therefore, when the video is summarized so as to satisfy the rhythm condition, shot changes can be anticipated, so every shot can be seen without being overlooked.

[0026] When the video is summarized so as to satisfy the content condition, the user can grasp the content of at least part of every shot. Because each shot gathers portions with the same content, this means the content of every shot can be grasped.

[0027] Because this means creates shots by gathering portions with the same content and plays them back one after another, it is used when one wants to know the temporal flow of the content in the video. If the rhythm condition is not satisfied, however, shot changes cannot be anticipated, so shots are overlooked. If the content condition is not satisfied, the content of some shots cannot be grasped. It is desirable to satisfy the rhythm condition and the content condition simultaneously.

[0028] Third, portions of the time series with similar content are gathered to create shot groups, and a part of each group is played back one after another, so portions with content as different as possible can be viewed efficiently.

[0029]

[Embodiment] An embodiment of the present invention is described below with reference to the drawings.

[0030] FIG. 1 is an overall system diagram of a video summarization apparatus in an embodiment of the present invention. In FIG. 1, 1 and 2 are input devices for the video to be processed (hereinafter, the processing target video): 1 is a video disk device and 2 is a VTR. 3 is a frame memory that captures frame images of the video signal output from the video disk device 1 or the VTR 2. 4 is a video compression device that compresses the video signal output from the video disk device 1 or the VTR 2. 5 is a first computer, which controls the video disk device 1, the VTR 2, the frame memory 3, and the video compression device 4. It also sends a control signal to the frame memory 3 to capture frame images, and summarizes the video by processing the time-series frame images. 6 is a file server that stores the video compressed by the video compression device 4 and the data and frame images sent from the first computer 5. 7 is a second computer that, at the user's request, retrieves the summary video or the processing target video from the file server 6 and plays it back.

[0031] The overall operation of the video summarization apparatus configured as above is described using the flowchart shown in FIG. 2.

[0032] In step 201, the first computer 5 in FIG. 1 summarizes the video by processing the time-series frame images while controlling the video disk device 1, the VTR 2, and the frame memory 3.

[0033] In step 202, the first computer 5, while controlling the video disk device 1, the VTR 2, and the video compression device 4, compresses the summarized video and the processing target video and stores them in the file server 6.

[0034] Besides directly compressing the video, various other methods of storing the summary video are conceivable. For example, as shown in FIG. 3, when the summary video can be expressed in a file format using the frame numbers of the processing target video, the frame number information may be stored in the file server 6 instead of the summary video. Alternatively, while controlling the video disk device 1, the VTR 2, and the frame memory 3, the first frame image of each shot group (described later) may be captured and stored in the file server 6 instead of the summary video. In this case, however, displaying the stored first frame images as a list of reduced images corresponds to presenting a summary of the video.
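To illustrate storing frame numbers instead of the summary video itself, the sketch below uses a purely hypothetical JSON layout; the actual file format of FIG. 3 is not reproduced in this text. Each segment is `(first_frame, last_frame, step)`, so a 4x portion is written with a step of 4.

```python
import json


def encode_summary(segments):
    """Serialize (first_frame, last_frame, step) segments to a JSON string."""
    return json.dumps({"segments": [list(segment) for segment in segments]})


def decode_frames(text):
    """Expand the stored segments into the list of frame numbers to play."""
    frames = []
    for first, last, step in json.loads(text)["segments"]:
        frames.extend(range(first, last + 1, step))
    return frames


if __name__ == "__main__":
    stored = encode_summary([(0, 8, 4), (100, 102, 1)])
    print(stored)
    print(decode_frames(stored))
```

A file of this kind is far smaller than a re-encoded video, at the cost of requiring the original video to be available at playback time.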

[0035] In step 203, the second computer 7, at the user's request, retrieves the summary video or the processing target video from the file server 6 and plays it back.

[0036] The following describes the video summarization process of the first computer 5, which is the specific operation of step 201 in FIG. 2.

[0037] FIG. 4 is a flowchart of an embodiment of the video summarization process of the first computer 5 in FIG. 1.

[0038] First, in step 401, the various parameters used in the video summarization process are initialized and the thresholds are set. In addition, the processing target frame image I_n (n is a natural number of 1 or more) is captured from the frame memory 3 and stored, for use in later steps when calculating feature quantities such as the "similarity between frame images," high-speed motion pixels, still constituent colors, moving constituent colors, still common colors, and moving common colors.

[0039] Next, in step 402, it is determined whether the video to be processed has ended. If it has ended, the video summarization process ends. If it has not ended, processing proceeds to step 403.

[0040] In step 403, the frame image I_{n+1} following the current processing target frame image is taken as the new processing target frame image I_n. A control signal is then sent to the video disk device 1 or the VTR 2 to play back the updated processing target frame image I_n, and the frame image I_n is captured from the frame memory 3.

[0041] In step 404, the end of a shot, that is, of a portion shot continuously in time with a single video camera, is detected. If the end of a shot is detected, that shot is taken as the new processing target shot SH_k (k is an integer of 1 or more), and processing proceeds to step 405. If no shot end is detected, processing proceeds to step 405 without doing anything. After processing has proceeded to step 405, step 404 is executed again only after the next processing target frame image I_{n+1} has been captured. Therefore, if the end of a shot cannot be detected, detection is attempted again after the next processing target frame image I_{n+1} has been captured.

[0042] One scene of a scenario or the like may also be treated as a shot. Accordingly, every occurrence of "shot" in the description from step 404 onward may be replaced with "one scene of a scenario."

[0043] As a method of detecting the end of a shot, the video change model method introduced in the Prior Art section has been proposed. In this embodiment the video is divided into shots automatically, but the user may instead divide it while viewing the video. If the video has already been divided into shots, step 404 may be omitted.

[0044] Steps 405 through 408 constitute a video summarization method that uses the following three assumptions, formulated from the results of a subjective evaluation of fast-forwarded video, and determines the playback speed so as to eliminate "portions whose playback speed is subjectively fast or slow."

[0045] ・ In portions where similar images continue for a long time (hereinafter, long-time similar sections), the playback speed feels slow.

[0046] ・ In portions of images containing fast-moving objects (hereinafter, high-speed motion sections), the playback speed feels fast.

[0047] ・ When the video is grouped by content and divided into shots, the playback speed feels fast in shots with a short display time (called short-duration shots).

[0048] The summary video created by this method is called a section-variable-speed summary video. In this embodiment, time sections other than long-time similar sections and high-speed motion sections are called standard sections. The playback speed of standard sections is set to 4x.

[0049] Step 405 is the processing that detects long-time similar sections. Because 4x fast-forward video consists of images at 4-frame intervals, this detection is performed by sampling images at 4-frame intervals and examining the similarity between them. The specific operation of step 405 is described using FIG. 5.

[0050] In step 501, the frame number of the processing target frame image I_n is first examined. Next, it is determined whether this frame image belongs to the sequence of images at 4-frame intervals. Because long-time similar sections are detected using only the images at 4-frame intervals, if the frame image is judged to belong to the sequence, processing proceeds to step 502 to detect a long-time similar section. If it is judged not to belong to the sequence, step 405 ends and processing proceeds to step 406 in FIG. 4.

[0051] In step 502, the similarity S(n, n-4) between the processing target frame image I_n and the preceding frame image in the sequence, I_{n-4}, is calculated. The method of calculating the similarity between frame images was chosen so that it has no correlation with motion that the user viewing the summary video can subjectively ignore.

[0052] Here the method of calculating the similarity between frame images is described briefly. Various methods are conceivable, such as the χ² test method (Nagasaka and Tanaka, "Automatic Detection of Scene Changes in Video Works," 40th National Convention of the Information Processing Society of Japan, 1Q-5, 1990), but this embodiment uses a method similar to the common color ratio method (Yamada, Fujioka, Kanamori, and Matsushima, "A Study of a Scene Change Detection Method Focusing on Common Colors in Each Partial Region," ITE Technical Report, September 1993, Vol. 17, No. 55, pp. 1-6).

[0053] In still images, and in video containing only motion that can be subjectively ignored, the distance an object moves on the screen in 4 frame times should be very small relative to the screen width. In the authors' experience, this distance is 2% of the screen width or less. The frame image I_n is therefore divided into partial regions R(j, n) (where j is an integer from 1 to 16), as shown in FIG. 6. Between the corresponding partial regions R(j, n) and R(j, n-4) of images I_n and I_{n-4} that are still or contain only subjectively negligible motion, the color histogram hardly changes. Although the number of partial regions is 16 in this embodiment, it need not be 16.

[0054] On the other hand, when a shot change or the like occurs between frame images I_n and I_{n-4}, the subject changes between the corresponding partial regions of I_n and I_{n-4}, and the colors making up the partial regions change. The similarity S_p(j, I_n, I_{n-4}) of corresponding partial regions is therefore calculated so that it decreases according to the area of the colors that newly appear or disappear between the corresponding partial regions.

[0055] Then the average of the similarities of the corresponding partial regions of frame images I_n and I_{n-4} is calculated and taken as the frame image similarity S(n, n-4). That is,

[0056]

[Equation 1]

$$S(n, n-4) = \frac{1}{16} \sum_{j=1}^{16} S_p(j, I_n, I_{n-4}) \tag{1}$$

is calculated.
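The averaging of partial-region similarities can be sketched as follows. The details of the common color ratio method are not given in this text, so `region_similarity` is a stand-in under stated assumptions: frames are 2-D lists of already quantized color labels, and the region similarity is 1 minus the area fraction of colors that occur in only one of the two regions. The 4 x 4 grid (16 regions) follows the embodiment; frames are assumed to be at least 4 pixels in each direction.

```python
from collections import Counter

GRID = 4  # 4 x 4 = 16 partial regions, as in the embodiment


def regions(frame, grid=GRID):
    """Yield the grid*grid blocks of a 2-D list of color labels, row by row."""
    height, width = len(frame), len(frame[0])
    for gy in range(grid):
        for gx in range(grid):
            yield [frame[y][x]
                   for y in range(gy * height // grid, (gy + 1) * height // grid)
                   for x in range(gx * width // grid, (gx + 1) * width // grid)]


def region_similarity(a, b):
    """1 minus the area fraction of colors present in only one region (assumed)."""
    count_a, count_b = Counter(a), Counter(b)
    appeared = sum(n for color, n in count_b.items() if color not in count_a)
    vanished = sum(n for color, n in count_a.items() if color not in count_b)
    return 1.0 - (appeared + vanished) / (len(a) + len(b))


def frame_similarity(frame_a, frame_b, grid=GRID):
    """Mean of the similarities of corresponding partial regions."""
    sims = [region_similarity(a, b)
            for a, b in zip(regions(frame_a, grid), regions(frame_b, grid))]
    return sum(sims) / len(sims)


if __name__ == "__main__":
    still = [[0] * 8 for _ in range(8)]
    changed = [row[:] for row in still]
    for y in range(2):
        for x in range(2):
            changed[y][x] = 9          # a new color fills one partial region
    print(frame_similarity(still, still))
    print(frame_similarity(still, changed))
```

Because each region is scored separately, a change confined to one region lowers the frame similarity by at most one sixteenth.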

[0057] Step 503 is the processing that updates the first frame number of the long-time similar section. Because the images within a long-time similar section resemble one another, whether the similarity S(n, n-4) between time-series images is at least the threshold θ_still is checked using the condition

$$S(n-4, n) \geq \theta_{\mathrm{still}} \tag{2}$$

If expression (2) holds, frame (n-4) is regarded as lying within a long-time similar section, and if the similar section end N_b, which represents the first frame number, has not yet been set, (n-4) is assigned to N_b.

[0058] Step 504 is the processing that detects candidates for a long-time similar section. Whether the similarity of the images at 4-frame intervals between the frame image at the similar section end N_b and the processing target frame image I_n is at least the threshold θ_still is checked using the condition

$$S(N_b + (i-1) \times 4,\; N_b + i \times 4) \geq \theta_{\mathrm{still}}, \quad 1 \leq i \leq (n - N_b)/4 \tag{3}$$

When expression (3) holds, the span between the frame image I_{N_b} at the similar section end and the processing target frame image I_n is regarded as a long-time similar section candidate, and processing proceeds to step 505. When expression (3) does not hold, a long-time similar section candidate cannot be determined, so processing proceeds to step 406 in FIG. 4.

[0059] The candidates obtained in step 504 include, in addition to "portions where similar images continue," "portions with gradual video change" such as that shown in FIG. 7. In step 505, it is first checked whether the similarity between the first frame image I_{N_b} of the long-time similar section candidate and each of the other frame images is at least the threshold θ_ratio, using the condition

$$S(N_b,\; N_b + i \times 4) \geq \theta_{\mathrm{ratio}}, \quad 1 \leq i \leq (n - N_b)/4 \tag{4}$$

Next, if expression (4) holds, whether the duration of the candidate is at least the minimum duration T_still of a long-time similar section is checked using the condition

$$n - N_b \geq T_{\mathrm{still}} \tag{5}$$

The minimum duration T_still in expression (5) is set when step 401 in FIG. 4 is executed.

【００６０】When expressions (4) and (5) both hold, the candidate obtained in step 504 is regarded as a long-term similar section, and step 405 described above ends.

【００６１】When expression (4) holds but expression (5) does not, whether the candidate obtained in step 504 is part of a long-term similar section cannot be determined without processing the images after the current frame image In. Step 405 therefore ends without taking any action.

【００６２】When expression (4) does not hold, the candidate is regarded as a "portion where the video changes gradually." Because expression (2) holds, the frames from frame number (n−4) onward are regarded as a new long-term similar section candidate, and (n−4) is assigned to the similar-section end Nb, which denotes the first frame number of the candidate.
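The decision made in steps 504 and 505 can be sketched in Python as follows. This is only an illustrative reading of expressions (3) to (5): the similarity function `S`, the function name, and the returned labels are assumptions introduced here, not part of the specification.

```python
def classify_candidate(S, Nb, n, theta_still, theta_ratio, T_still, step=4):
    """Classify the span from frame Nb to frame n (steps 504 and 505).

    S is a hypothetical callable S(a, b) giving the similarity between
    frames a and b. Returns "no_candidate", "gradual_change",
    "long_term_similar", or "undecided".
    """
    count = (n - Nb) // step
    # Expression (3): each pair of frames 4 apart must be similar.
    if not all(S(Nb + (i - 1) * step, Nb + i * step) >= theta_still
               for i in range(1, count + 1)):
        return "no_candidate"
    # Expression (4): each sampled frame must also resemble the first frame.
    if not all(S(Nb, Nb + i * step) >= theta_ratio
               for i in range(1, count + 1)):
        return "gradual_change"
    # Expression (5): the candidate must last long enough.
    if n - Nb >= T_still:
        return "long_term_similar"
    return "undecided"  # later frames are needed before deciding
```

A "gradual_change" result corresponds to the case where Nb is reset to (n−4), and "undecided" to the case where step 405 ends without action.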

【００６３】In this embodiment, the method of computing the similarity between frame images was chosen so that the similarity is uncorrelated with motion that a user viewing the summary video can subjectively ignore. Because subjective impressions vary from user to user, however, the similarity computation may instead be chosen to emphasize the motion of the objects the user attends to.

【００６４】In this embodiment, "portions where similar images continue" were found using expressions (3) and (4); that is, using both the similarity between successive images in the time series and the similarity relative to the first frame image of the similar section. The same effect can be obtained by other methods, such as using the last frame image of the similar section as the reference, so "portions where similar images continue" may be computed by methods other than expressions (3) and (4). They may also be computed using expression (4) alone.

【００６５】After step 405 of FIG. 4 ends, processing proceeds to step 406. Step 406 detects "portions of the video that contain objects moving at high speed." The specific operation of step 406 is described with reference to FIG. 8.

【００６６】Step 801 detects pixels on objects moving at high speed, using the current frame image In and the immediately preceding image In-1. The principle of this processing is explained with reference to FIG. 9.

【００６７】As shown in FIG. 9, assume that when an object at position p (where p is a vector) moves at high speed between frame images In-1 and In, the original object is no longer present within distance Lmin. Then, because of the object's movement, the differences between the luminance f(p, n−1) of the pixel at position p in In-1 and the luminance f(p+d, n) of every pixel within the circle of radius Lmin centered on the same position p in In (where d is a vector of length at most Lmin) are all at least a preset threshold θw1, so that

|f(p, n−1) − f(p+d, n)| ≧ θw1, |d·d| ≦ Lmin×Lmin ... (6)

holds. In expression (6), |a| denotes the absolute value of the scalar a, and (d·d) denotes the inner product of the vector d with itself.

【００６８】Furthermore, noise is removed by detecting only objects larger than a circle of radius Rmin. Regions of uniform luminance are regarded as the same object. As shown in FIG. 9, whether expression (6) holds at position p is checked only when the luminance difference in In-1 between the pixel at position p and each pixel at p+D (where D is a vector of length at most Rmin) is below the threshold θw1, that is, when the conditional expression

|f(p, n−1) − f(p+D, n−1)| < θw1, |D·D| ≦ Rmin×Rmin ... (7)

holds.

【００６９】Based on the above, step 801 of FIG. 8 detects each pixel p that satisfies expressions (6) and (7) simultaneously, regards it as a "pixel on an object moving at high speed," and registers it as a high-speed motion pixel.
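As a rough illustration of step 801, the following Python sketch tests expressions (6) and (7) at every pixel of two small luminance images. Representing images as lists of rows, restricting offsets to integer positions, and skipping offsets that fall outside the image are all assumptions made for the sketch; the function names are hypothetical.

```python
def disk_offsets(radius):
    """Integer offsets d with d.d <= radius^2 (the circles in (6) and (7))."""
    r = int(radius)
    return [(dy, dx) for dy in range(-r, r + 1) for dx in range(-r, r + 1)
            if dy * dy + dx * dx <= radius * radius]


def high_speed_pixels(prev, curr, l_min, r_min, theta_w1):
    """Return the set of (y, x) positions satisfying (6) and (7).

    prev and curr are the luminance images I(n-1) and I(n), given as
    equal-sized lists of rows. Out-of-image offsets are skipped, which is
    a boundary-handling assumption not stated in the text.
    """
    h, w = len(prev), len(prev[0])

    def inside(y, x):
        return 0 <= y < h and 0 <= x < w

    d_offsets, big_d_offsets = disk_offsets(l_min), disk_offsets(r_min)
    found = set()
    for y in range(h):
        for x in range(w):
            f = prev[y][x]
            # (7): the neighbourhood of p in I(n-1) has uniform luminance.
            uniform = all(abs(f - prev[y + dy][x + dx]) < theta_w1
                          for dy, dx in big_d_offsets
                          if inside(y + dy, x + dx))
            # (6): every pixel near p in I(n) differs strongly from f(p, n-1).
            moved = all(abs(f - curr[y + dy][x + dx]) >= theta_w1
                        for dy, dx in d_offsets
                        if inside(y + dy, x + dx))
            if uniform and moved:
                found.add((y, x))
    return found
```

Only pixels in the interior of a uniform region that has completely left its neighbourhood are reported, which is how the Rmin test suppresses isolated noisy pixels.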

【００７０】To simplify processing, pixels may be subsampled, or the average of a block of 8 pixels × 8 lines may be treated as a single pixel. When several pixels are represented by one pixel in this way, the area per pixel becomes large, as in a mosaic. In that case, as shown in FIG. 10, pixels that satisfy expression (6) alone may be regarded as high-speed motion pixels.

【００７１】A method using motion vectors, for example, is also conceivable for detecting objects moving at high speed. High-speed motion pixels may be detected with motion vectors as in the following procedure.

【００７２】[Example procedure for detecting objects moving at high speed using motion vectors]
Step 1) Divide the image into multiple blocks, then compute a motion vector for each block using two consecutive images in the time series.

【００７３】Step 2) Find the blocks whose motion vector magnitude is at least a fixed value.
Step 3) Regard the pixels contained in the blocks found in step 2 as high-speed motion pixels.
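Steps 2 and 3 of this motion-vector procedure might look like the sketch below. It assumes the per-block motion vectors of step 1 are already available (block matching itself is not shown); the dictionary layout and the function name are hypothetical.

```python
import math


def high_speed_pixels_from_blocks(block_vectors, block_size, min_magnitude):
    """Mark every pixel of each block whose motion vector is large.

    block_vectors maps (block_row, block_col) to a motion vector (vy, vx),
    assumed to have been computed in step 1.
    """
    pixels = set()
    for (by, bx), (vy, vx) in block_vectors.items():
        # Step 2: keep blocks whose vector magnitude reaches the fixed value.
        if math.hypot(vy, vx) >= min_magnitude:
            # Step 3: every pixel of the block is a high-speed motion pixel.
            for y in range(by * block_size, (by + 1) * block_size):
                for x in range(bx * block_size, (bx + 1) * block_size):
                    pixels.add((y, x))
    return pixels
```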

【００７４】Furthermore, in expressions (6) and (7) the pixels compared with pixel p are those contained in a particular circle, but, as shown in FIG. 11, they may instead be the pixels contained in a rectangle that includes pixel p.

【００７５】A region the user is expected to attend to may also be assumed in advance, and high-speed motion pixels detected only among the pixels in that region. For example, as shown in FIG. 12, high-speed motion pixels may be detected on the assumption that the user always attends to the vicinity of the center of the screen.

【００７６】Step 802 counts the total number of high-speed motion pixels contained in frame image In-1. When the total exceeds the threshold θm, which is set in order to remove noise, the span between this frame image In-1 and the next frame image In is regarded as a high-speed motion section candidate.

【００７７】Subjective evaluation of fast-forward video shows that even when an object moving at high speed is present, it is subjectively ignored if it moves only momentarily. Step 804 therefore excludes, from the high-speed motion section candidates, sections that became candidates because of a momentary motion. There are several ways to implement this processing; one of them is described here.

【００７８】In this embodiment, 4× fast-forward playback is the baseline, so the video is grouped into units of four frames, called section units. Step 803 determines whether a high-speed motion section candidate was detected in step 802. If one was detected, the section unit containing the candidate is denoted UN (N is an integer of 1 or more), and processing proceeds to step 804. If none was detected, no high-speed motion section candidate exists, so step 406 ends.

【００７９】Step 804 determines whether section units containing high-speed motion section candidates continue for at least a preset lower limit Nmove. Specifically, it checks whether every section unit between section unit UN and the section unit (Nmove−1) units earlier, UN-(Nmove-1)*4, contains a high-speed motion section candidate. If all of them do, the span between frame images IN-Nmove*4 and IN is regarded as a high-speed motion section.
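Steps 802 to 804 can be approximated by the following sketch. It takes the per-frame counts of high-speed motion pixels as input, flags 4-frame section units, and keeps only runs of at least Nmove flagged units. Merging whole runs in one pass, rather than testing each unit UN against its predecessors as the text does, is a simplification assumed here; the names are hypothetical.

```python
def high_speed_sections(pixel_counts, theta_m, n_move, unit=4):
    """Return inclusive (first_frame, last_frame) high-speed motion spans.

    pixel_counts[i] is the number of high-speed motion pixels found
    between frame i and frame i + 1 (the output of step 801).
    """
    # Step 802: a frame pair is a candidate when its count exceeds theta_m.
    candidate = [count > theta_m for count in pixel_counts]
    # A section unit is flagged when it contains at least one candidate.
    flags = [any(candidate[i:i + unit])
             for i in range(0, len(candidate), unit)]
    # Step 804: keep only runs of at least n_move consecutive flagged units.
    sections, run_start = [], None
    for index, flagged in enumerate(flags + [False]):
        if flagged and run_start is None:
            run_start = index
        elif not flagged and run_start is not None:
            if index - run_start >= n_move:
                sections.append((run_start * unit, index * unit - 1))
            run_start = None
    return sections
```

A single flagged unit surrounded by unflagged ones is discarded, which corresponds to ignoring momentary motion.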

【００８０】After step 804 ends, step 406 described above ends. After step 406 of FIG. 4 ends, processing proceeds to step 407.

【００８１】Step 407 determines the video playback speeds for the section variable-speed summary video. First, the playback speed of high-speed motion sections is set to 2×. Next, the long-term similar sections that contain no high-speed motion section are extracted, and their playback speed is set to 8×. Finally, the remaining sections, whose playback speed has not yet been determined, are regarded as standard sections, and their playback speed is set to 4×.
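The speed assignment of step 407 could be expressed as in the sketch below, using the 2×, 4×, and 8× values of this embodiment. Representing sections as inclusive frame spans and returning one speed per frame are choices made for the sketch, not details given in the text.

```python
def assign_speeds(num_frames, high_speed, long_similar):
    """Return a per-frame playback speed list following step 407.

    high_speed and long_similar are lists of inclusive (first, last)
    frame spans.
    """
    def overlaps(a, b):
        return a[0] <= b[1] and b[0] <= a[1]

    speeds = [4] * num_frames                # standard sections: 4x
    for span in long_similar:
        # 8x only for long-term similar sections with no high-speed motion.
        if not any(overlaps(span, h) for h in high_speed):
            for f in range(span[0], span[1] + 1):
                speeds[f] = 8
    for first, last in high_speed:           # high-speed motion sections: 2x
        for f in range(first, last + 1):
            speeds[f] = 2
    return speeds
```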

【００８２】In step 407, the playback speeds of high-speed motion sections, standard sections, and long-term similar sections were set to 2×, 4×, and 8×, respectively, but other values may be used as long as high-speed motion sections play slower than standard sections and long-term similar sections play faster than standard sections. The playback speed may also be determined from the display time, as when an entire long-term similar section is played in 1 second.

【００８３】When the video is played at the speeds determined in step 407, the display time of some shots falls below the preset shot-length lower limit Nshot. Step 408 therefore corrects the playback speed.

【００８４】In step 408, the time NSHk required for playback (k is an integer of 1 or more) is first computed for each shot SHk obtained in step 404. Whether this time is below the shot-length lower limit Nshot is then checked using the conditional expression

NSHk < Nshot ... (8)

When expression (8) holds, shot SHk is regarded as a short shot, and its playback speed is set so that the display time of the shot equals the shot-length lower limit Nshot.
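The correction in step 408 can be illustrated with the sketch below. It measures time in frame periods, so a source frame played at speed v contributes 1/v to NSHk, and it gives a corrected shot one uniform speed. Both conventions are assumptions; the text only requires that the display time become Nshot.

```python
def correct_short_shot(frame_speeds, n_shot):
    """Return per-frame speeds for one shot after the step 408 check.

    frame_speeds holds the step 407 speed of each frame of the shot.
    n_shot is the shot-length lower limit in frame periods.
    """
    ns_hk = sum(1.0 / v for v in frame_speeds)   # time needed for playback
    if ns_hk < n_shot:                           # expression (8)
        uniform = len(frame_speeds) / n_shot     # display time becomes n_shot
        return [uniform] * len(frame_speeds)
    return list(frame_speeds)
```

For instance, a 40-frame shot at 4× needs 10 frame periods; with Nshot = 20 it is slowed to 2× so that it is displayed for exactly 20.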

【００８５】In this embodiment the shot-length lower limit is a fixed value, but it may be varied according to the content, such as the fineness of texture.

【００８６】After step 408 ends, processing proceeds to step 409. Step 409 implements a video summarization method that determines the playback speed so that shot boundaries correlate with a rhythm of a preset period, and then plays the shots of the time series one after another while evoking this "rhythm." First, the playback speed is determined so that the summary video has a rhythm of the preset period at its shot boundaries. Next, the playback speed is corrected so that the user can grasp the content of each shot while the rhythm is maintained. The summary video created in step 409 is called the rhythm-presentation summary video.

【００８７】The specific operation of step 409 is described with reference to FIG. 13. To give the shot boundaries a rhythm of a preset period, the playback speed may, for example, be set so that the display time of every shot is constant. Various methods of evoking the rhythm are conceivable, such as sounding a tone at shot boundaries or displaying a still image at shot boundaries.

【００８８】In this embodiment, as in the example shown in FIG. 14, the first frame image of a shot is displayed for the still time Tstl, and then the entire shot is displayed in fast-forward over the fast-forward time Tlen. This produces a rhythm with an interval of (Tstl + Tlen). For example, setting Tstl to 0.2 seconds and Tlen to 0.8 seconds produces a rhythm at 1-second intervals.

【００８９】Step 1301 of FIG. 13 obtains the number of frames LSHk of the current shot SHk (k is an integer of 1 or more).

【００９０】Step 1302 determines the playback speed V(k) by

V(k) = LSHk / Tlen ... (9)

so that the display time of the fast-forward portion of the current shot SHk equals the preset fast-forward time Tlen.
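A small numeric sketch of expression (9) and of the rhythm interval follows. The text leaves the units of V(k) implicit; here Tlen is taken in seconds and the result is divided by an assumed source frame rate of 30 frames per second so that it reads as a multiple of normal speed. That frame rate and the function names are assumptions.

```python
def rhythm_speed(ls_hk, t_len, frame_rate=30.0):
    """Expression (9), expressed as a multiple of normal playback speed.

    ls_hk is the shot length in frames and t_len is in seconds; the
    30 frames/s default is an assumption, not a value from the text.
    """
    return ls_hk / (t_len * frame_rate)


def rhythm_period(t_stl, t_len):
    """Interval between shot starts: still time plus fast-forward time."""
    return t_stl + t_len
```

Under these assumptions a 240-frame shot with Tlen = 0.8 seconds is played at about 10×, and Tstl = 0.2 seconds gives a period of about 1 second.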

【００９１】Step 1303 determines the speed upper limit Vmax(k), which serves as the criterion for determining the playback speed of the current shot SHk, so that the user can grasp the content of the shot. Various methods of determining Vmax(k) are conceivable. For example, the playback speeds determined in step 407 may be averaged over the current shot SHk and the result assigned to Vmax(k). A preset fixed value may also be used as Vmax(k). The following method, which uses the long-term similar sections, standard sections, and high-speed motion sections determined in step 407, is also conceivable.

【００９２】[Method of determining the speed upper limit using long-term similar sections, standard sections, and high-speed motion sections]
Step 1) For a shot that contains a long-term similar section and no high-speed motion section, set the speed upper limit Vmax(k) to a larger value than for a shot consisting only of standard sections. For example, set it to 8×.

【００９３】Step 2) For a shot that contains a high-speed motion section and no long-term similar section, set the speed upper limit Vmax(k) to a smaller value than for a shot consisting only of standard sections. For example, set it to 2×.

【００９４】Step 3) For a shot that contains both a high-speed motion section and a long-term similar section, set the speed upper limit Vmax(k) to the same value as for a shot consisting only of standard sections. For example, set it to 4×.
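The three rules above reduce to a small lookup, sketched here with the example values 8×, 2×, and 4×. The function name and boolean arguments are hypothetical.

```python
def speed_upper_limit(has_long_similar, has_high_speed):
    """Example Vmax(k) chosen from the kinds of section a shot contains."""
    if has_long_similar and not has_high_speed:
        return 8   # step 1: similar content only, may be played faster
    if has_high_speed and not has_long_similar:
        return 2   # step 2: fast motion only, must be played slower
    return 4       # step 3: both kinds, or standard sections only
```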

【００９５】Step 1304 checks whether the playback speed V(k) determined by expression (9) is at least the speed upper limit Vmax(k) determined in step 1303, using the conditional expression

V(k) ≧ Vmax(k) ... (10)

When expression (10) holds, the display time of the fast-forward portion of the current shot SHk is extended and the playback speed corrected so that the user can grasp the content. This correction is performed in step 1305.

【００９６】When expression (10) does not hold, the playback speed of the fast-forward portion of shot SHk is set to V(k), as for shot SH1 in FIG. 15, and step 1309 described above ends.

【００９７】As already noted, step 1305 lets the user grasp the content of the shot: it extends the display time of the shot and corrects the playback speed.

【００９８】To give the shot boundaries a rhythm, the display time of such a shot is made an integer multiple of that of the other shots. However, so that the shots are played one after another in a short time, the display time of the fast-forward portion of each shot is kept as short as possible. Step 1305 therefore sets the display time of the fast-forward portion of a shot satisfying expression (10) to 2×Tlen, twice that of the other shots.

【００９９】Next, the method of determining the playback speed is described. The playback speed of the first half of a shot satisfying expression (10) is corrected to be at most the speed upper limit Vmax(k) determined in step 1303. That is, as for shot SH2 in FIG. 15, the playback speed VP(k) of the first half is set to a value no greater than Vmax(k), and the playback speed VR(k) of the second half is then determined so that the display time of the entire fast-forward portion equals 2×Tlen. The second-half playback speed VR(k) may exceed Vmax(k). To maintain the rhythm, a still portion is inserted between the first half and the second half of the fast-forward portion.

【０１００】The playback speed VP(k) of the first half of the fast-forward portion can be determined by

【０１０１】

【数２】[Equation 2]

The playback speed VR(k) of the second half of the fast-forward portion can be determined by

【０１０２】

【数３】[Equation 3]

After expressions (11) and (12) have been computed, step 409 described above ends.

【０１０３】If it is acceptable for some shots to have content that cannot be grasped, step 409 may end after step 1302. If shot changes do not need to be predictable, the playback speed V(k) may be determined without using expression (9).

【０１０４】After step 409 of FIG. 4 ends, processing proceeds to step 410. Steps 410 to 412 implement a video summarization method that uses the correlation between shots in the time series to merge similar shots into shot groups, selects from each shot group a partial video consisting of multiple consecutive frame images (for example, 1 second's worth), and then plays the partial videos one after another in time order. First, similar shots are merged into shot groups using the correlation between shots in the time series. Next, one partial video is selected from each shot group and processed so as to make the characteristic of the shot group explicit. For example, when shots containing similar objects are merged into a shot group, the similar object is the characteristic of the shot group. When shots of nearly equal duration are merged, that duration is the characteristic of the shot group.

【０１０５】When the summary video is played, the processed partial videos are played one after another. The selected partial videos may also be played directly, without processing.

【０１０６】The summary video created by steps 410 to 412 is called the partial-selection summary video.

【０１０７】Step 410 creates shot groups. Its specific operation is described with reference to FIG. 16.

【０１０８】A shot is a unit of content, so the frame images within one shot share a common feature. For example, in a shot that tracks a person, the person appears in every frame image. The content of a shot can therefore be represented by some of its frame images. In this embodiment, to simplify processing, the number Nrep of frame images used to represent the content is set in advance, and the content of a shot is represented by Nrep frame images from the shot (hereinafter called representative spatiotemporal images).

【０１０９】Step 1601 determines the images that represent shot SHk and regards them as the representative spatiotemporal images Ik,j (k is an integer of 1 or more; j is an integer from 1 to Nrep). Various methods of choosing the representative spatiotemporal images are conceivable, such as selecting the first Nrep frames or selecting images at fixed time intervals. If the amount of processing is not constrained, all frame images in the shot may be regarded as representative spatiotemporal images.

【０１１０】After step 1601 ends, consecutive shots with similar content are merged using the representative spatiotemporal images and regarded as a shot group. Examples of shots with similar content include shots with similar backgrounds and shots of similar subjects. In this embodiment, shots are merged by focusing on the color and motion of objects on the screen.

【０１１１】According to the authors' analysis, in the representative spatiotemporal images of shots with similar content, colors common to the representative spatiotemporal images exist, and the objects having the common colors move in a common way. It is difficult to determine whether two objects appearing on the screen move in a common way, but changes caused by object motion can be detected easily using the pixel change region described later, motion vectors, and the like. The representative spatiotemporal images are therefore processed and divided into a region affected by changes due to the motion of objects on the screen (hereinafter, the moving region) and the remaining region (hereinafter, the still region). Shots with similar content then satisfy the following common color ratio condition.

【０１１２】Common color ratio condition: Call the colors that make up the still region and the moving region of the representative spatiotemporal images the still constituent colors and the moving constituent colors, respectively. Still constituent colors and moving constituent colors common to the representative spatiotemporal images exist, and the pixels having these colors account for at least the threshold θshot of each representative spatiotemporal image.

【０１１３】Examples of shots with similar content that satisfy the common color ratio condition are given below.

Example 1: Shots with similar backgrounds, as when shots taken at the same location follow one another. For the representative spatiotemporal images Ik-1,1 and Ik,1 of the two shots SHk-1 and SHk of the poolside scene shown in FIG. 17:

Proportion of still constituent color A in Ik-1,1 = 60%
Proportion of still constituent color B in Ik-1,1 = 20%
Proportion of still constituent color C in Ik-1,1 = 15%
Proportion of still constituent color D in Ik-1,1 = 5%
Proportion of still constituent color A in Ik,1 = 80%
Proportion of still constituent color C in Ik,1 = 15%
Proportion of still constituent color E in Ik,1 = 5%

Therefore:

Sum of the proportions of the common still constituent colors A and C in Ik-1,1 = 75%
Sum of the proportions of the common still constituent colors A and C in Ik,1 = 95%

so the common still constituent colors account for at least 75% of the entire screen.

【０１１４】Example 2: Shots of similar subjects, as when shots of the same subject follow one another. For the representative spatiotemporal images Ik-1,1 and Ik,1 of the two shots SHk-1 and SHk that track an automobile, shown in FIG. 18:

Proportion of still constituent color V in Ik-1,1 = 40%
Proportion of still constituent color W in Ik-1,1 = 40%
Proportion of still constituent color X in Ik-1,1 = 10%
Proportion of still constituent color Y in Ik-1,1 = 10%
Proportion of still constituent color Z in Ik,1 = 40%
Proportion of still constituent color W in Ik,1 = 35%
Proportion of still constituent color X in Ik,1 = 10%
Proportion of still constituent color Y in Ik,1 = 15%

Therefore:

Sum of the proportions of the common moving constituent colors W, X, and Y in Ik-1,1 = 60%
Sum of the proportions of the common moving constituent colors W, X, and Y in Ik,1 = 60%

so the common moving constituent colors account for at least 60% of the entire screen.
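The arithmetic in the two examples can be reproduced with a short sketch. Each image is reduced to a table of color label versus percentage, which is an assumed representation. The sketch also ignores the distinction between still and moving constituent colors and simply sums the shares of the colors present in both images.

```python
def common_color_ratios(image_a, image_b):
    """Share of each image covered by the colors present in both.

    Each image is a dict mapping a color label to its proportion in
    percent. Returns (share_in_a, share_in_b).
    """
    common = set(image_a) & set(image_b)
    return (sum(image_a[c] for c in common),
            sum(image_b[c] for c in common))


def satisfies_condition(image_a, image_b, theta_shot):
    """True when the common colors reach theta_shot percent in both images."""
    return all(r >= theta_shot
               for r in common_color_ratios(image_a, image_b))


# Example 1 (FIG. 17): the common colors are A and C.
prev = {"A": 60, "B": 20, "C": 15, "D": 5}
curr = {"A": 80, "C": 15, "E": 5}
print(common_color_ratios(prev, curr))   # (75, 95)
```

With θshot = 75 the two poolside shots satisfy the condition; with the Example 2 proportions the function returns (60, 60).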

【０１１５】Step 1602 of FIG. 16 detects the moving region from the representative spatiotemporal images and divides them into a moving region and a still region. When motion vectors are used, the moving region is as shown in FIG. 19. Alternatively, when an object does not move between two frame images, the luminances of pixels at the same position are nearly equal, so if the following pixel change region is regarded as the moving region, the moving region is as shown in FIG. 20.

[0116] Pixel change region: the set of pixels for which, between a frame image Ik,j in the representative spatiotemporal image and the frame image one frame time later, the absolute value of the luminance difference between pixels at the same position is equal to or greater than the threshold θW1.
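The pixel change region can be illustrated with a minimal sketch. It assumes each frame is a list of rows of integer luminance values; the function name `pixel_change_region` and the sample values are hypothetical and are not taken from the patent.

```python
from typing import List

Frame = List[List[int]]  # luminance values, one inner list per row (assumed layout)


def pixel_change_region(frame: Frame, next_frame: Frame, theta_w1: int) -> List[List[bool]]:
    """Mark pixels whose absolute luminance difference from the pixel at the
    same position one frame later is at least theta_w1."""
    return [
        [abs(a - b) >= theta_w1 for a, b in zip(row, next_row)]
        for row, next_row in zip(frame, next_frame)
    ]


# A bright pixel moves one position to the right between the two frames.
frame_a = [[10, 10, 10], [10, 200, 10]]
frame_b = [[10, 10, 10], [10, 10, 200]]
mask = pixel_change_region(frame_a, frame_b, theta_w1=30)
```

Both the position the bright pixel left and the position it arrived at are marked, so the mask covers the area swept by the moving object.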

[0117] Step 1603 obtains the static constituent colors and the dynamic constituent colors. In the following, the red, green, and blue components of a pixel are called R, G, and B, respectively. First, in the static region of the representative spatiotemporal image of shot SHk, a histogram HS(c, SHk) is obtained over 512 colors c, with 8 levels for each of R, G, and B. Similarly, a histogram HM(c, SHk) is obtained for the moving region. Next, to exclude colors with few pixels as noise, a minimum pixel count θH for a constituent color is set, and the static constituent colors HSVk and the dynamic constituent colors HMVk (k is an integer of 1 or more) are obtained using the following equations.

[0118]

HSVk = { c | HS(c, SHk) > θH }   (13)
HMVk = { c | HM(c, SHk) > θH }   (14)

In the following, the first shot of a shot group is written SHtop (top is an integer of 1 or more). Steps 1604 through 1606 examine the time-series shots from SHtop onward to determine how far the shots satisfy the common color ratio condition.
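As a rough illustration of step 1603 and equations (13) and (14), the sketch below builds a 512-bin color histogram and keeps the bins whose pixel count exceeds θH. It assumes 8-bit RGB input and quantizes each channel by keeping its top three bits. The text does not specify the quantization, so that choice and all names here are assumptions.

```python
from collections import Counter
from typing import Iterable, Set, Tuple

Pixel = Tuple[int, int, int]  # 8-bit (R, G, B), an assumed input format


def quantize(pixel: Pixel) -> int:
    """Map an 8-bit RGB pixel to one of 8 x 8 x 8 = 512 color bins."""
    r, g, b = (v >> 5 for v in pixel)  # top 3 bits: 8 levels per channel
    return (r << 6) | (g << 3) | b


def histogram(pixels: Iterable[Pixel]) -> Counter:
    """Histogram over the 512 bins, e.g. HS(c, SHk) for a static region."""
    return Counter(quantize(p) for p in pixels)


def constituent_colors(hist: Counter, theta_h: int) -> Set[int]:
    """Equations (13)/(14): bins whose pixel count exceeds theta_h."""
    return {c for c, n in hist.items() if n > theta_h}


# Five reddish pixels and one bluish pixel; the single blue pixel is treated as noise.
static_pixels = [(250, 0, 0)] * 5 + [(0, 0, 250)]
hs = histogram(static_pixels)
hsv = constituent_colors(hs, theta_h=2)
```

The same two functions would be applied to the pixels of the moving region to obtain HM(c, SHk) and HMVk.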

[0119] In step 1604, the static constituent colors and the dynamic constituent colors common to shots SHtop through SHtop+m are called the static common colors HSCtop,m and the dynamic common colors HMCtop,m, respectively (top and m are integers of 1 or more), and are calculated using the following equations.

[0120]

[Equation 4]

In step 1605, the proportion of the representative spatiotemporal image of shot SHk that is occupied by pixels having a static common color HSCtop,m or a dynamic common color HMCtop,m is called the image common color ratio AMC(k, top, m), and is calculated using the following equation.

[0121]

[Equation 5]

In step 1606, when the expression representing satisfaction of the common color ratio condition,

AMC(k, top, m) ≥ θshot, top ≤ k ≤ top+m   (18)

holds for every k, shots SHtop through SHtop+m are regarded as satisfying the common color ratio condition. To determine how far the shots satisfy the condition, if expression (18) holds for every k, m is incremented by 1 and processing returns to step 1604. Otherwise, processing proceeds to step 1607.
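The loop formed by steps 1604 through 1606 can be sketched as follows. The formulas for the common colors and for AMC are not reproduced in this text, so the sketch assumes that the common colors are the intersection of the constituent colors of the shots, and that AMC is the fraction of a shot's pixels (static and moving regions together) falling in those common colors. The starting value m = 1, the zero-based shot indices, and all names are likewise assumptions.

```python
from collections import Counter
from typing import List, Set, Tuple

# Per shot: (HS, HM), the color histograms of the static and moving regions.
Shot = Tuple[Counter, Counter]


def colors(hist: Counter, theta_h: int) -> Set[int]:
    """Constituent colors: bins with more than theta_h pixels."""
    return {c for c, n in hist.items() if n > theta_h}


def common_colors(shots: List[Shot], top: int, m: int, theta_h: int):
    """Step 1604 (assumed form): colors shared by shots top..top+m."""
    group = shots[top:top + m + 1]
    hsc = set.intersection(*(colors(hs, theta_h) for hs, _ in group))
    hmc = set.intersection(*(colors(hm, theta_h) for _, hm in group))
    return hsc, hmc


def amc(shot: Shot, hsc: Set[int], hmc: Set[int]) -> float:
    """Step 1605 (assumed form): share of the shot's pixels in common colors."""
    hs, hm = shot
    total = sum(hs.values()) + sum(hm.values())
    shared = sum(hs[c] for c in hsc) + sum(hm[c] for c in hmc)
    return shared / total if total else 0.0


def first_failing_m(shots: List[Shot], top: int, theta_h: int, theta_shot: float) -> int:
    """Step 1606: grow m while condition (18) holds for every k in top..top+m.
    On return, shots top..top+m-1 satisfy the condition."""
    m = 1
    while top + m < len(shots):
        hsc, hmc = common_colors(shots, top, m, theta_h)
        if all(amc(shots[k], hsc, hmc) >= theta_shot for k in range(top, top + m + 1)):
            m += 1  # condition holds: try to include one more shot
        else:
            break
    return m


shots = [
    (Counter({1: 80, 2: 20}), Counter()),
    (Counter({1: 70, 3: 30}), Counter()),
    (Counter({4: 100}), Counter()),
]
m_end = first_failing_m(shots, top=0, theta_h=5, theta_shot=0.6)
```

In this toy data the first two shots share color 1 over 80% and 70% of their pixels, while the third shares nothing, so the loop stops once the third shot is added.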
~ SHtop + m is regarded as satisfying the common color ratio condition. In order to check up to which shots satisfy the common color ratio condition, if the expression (18) is satisfied for all ks, 1 is added to m and the process returns to step 1604. Otherwise, proceed to step 1607.

[0122] In this embodiment, the end of the shot group is determined using the image common color ratio between two consecutive shots in the time series. First, in step 1607, the image common color ratios AMC(k, k, 1) and AMC(k+1, k, 1) defined by equation (17) are calculated (k is an integer from top to top+m inclusive). The former, AMC(k, k, 1), represents the proportion of the representative spatiotemporal image of the earlier shot SHk that is occupied by the static and dynamic common colors of the two shots SHk and SHk+1. The latter, AMC(k+1, k, 1), represents the proportion of the representative spatiotemporal image of the later shot SHk+1 that is occupied by those common colors.

[0123] In step 1608, the minimum of the image common color ratios,

SS(k) = min(AMC(k, k, 1), AMC(k+1, k, 1))   (19)

is calculated and regarded as the priority for integrating shots SHk and SHk+1.

[0124] In step 1609, the end of the shot group is determined from among shots SHtop through SHtop+m-1, which satisfy the common color ratio condition, and the next shot SHtop+m. From the integration priorities SS(top) through SS(top+m-1) between these shots, the SS(kmin) that satisfies the expression for the minimum priority,

SS(kmin) ≤ SS(k), top ≤ k ≤ top+m-1   (20)

is found, and the shot SHkmin with the minimum priority is regarded as the end of the shot group, as in the examples of FIG. 21(a) and (b) (the second shot in both). This completes step 1610 described above.
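Steps 1608 and 1609 amount to taking the pairwise minimum in equation (19) and then the index of the smallest priority in equation (20). The sketch below assumes the pairwise ratios AMC(k, k, 1) and AMC(k+1, k, 1) have already been computed. The function name, the input layout, and the rule that ties go to the earliest shot are illustrative assumptions.

```python
from typing import List, Tuple


def group_end(top: int, pair_ratios: List[Tuple[float, float]]) -> int:
    """pair_ratios[i] holds (AMC(k, k, 1), AMC(k+1, k, 1)) for k = top + i.
    Returns kmin, the index of the shot taken as the end of the shot group."""
    priorities = [min(a, b) for a, b in pair_ratios]  # equation (19)
    # Equation (20): index of the smallest priority (first one on ties).
    i_min = min(range(len(priorities)), key=priorities.__getitem__)
    return top + i_min


# Three candidate boundaries starting at shot 1; the weakest link follows shot 2.
kmin = group_end(1, [(0.9, 0.8), (0.5, 0.7), (0.85, 0.9)])
```

The group is cut where two consecutive shots share the least color, which is the boundary least worth merging across.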

[0125] Various methods of obtaining shot groups are conceivable. For example, time-series shots of almost equal duration have similar content, so they can be regarded as one shot group. A method different from that of this embodiment may therefore be used to obtain the shot groups.

[0126] Alternatively, after step 1606, instead of proceeding to step 1607, the last shot that satisfies the common color ratio condition, SHtop+m-1, may be regarded as the end of the shot group, and step 410 may be ended.

[0127] After step 410 of FIG. 4 ends, processing proceeds to step 411. In step 411, one partial moving image is selected from each shot group. There are various ways to select the partial moving image, such as taking the beginning of the first shot of the shot group or the beginning of a shot that represents the content of the shot group. Likewise, various methods are conceivable for setting the duration of the partial moving image, such as using a fixed length or varying the length according to the audio content.

[0128] Although partial moving images are selected in step 411, one still image may instead be selected from each shot group. In this case, displaying the selected still images one after another yields the summary video. The selected still images may also be reduced in size and displayed as a list. The list display can be treated in the same way as a summary video.

[0129] After step 411 ends, processing proceeds to step 412. Step 412 makes the characteristics of the shot group explicit by processing the partial moving image selected in step 411.

[0130] In this embodiment, the shots included in a shot group satisfy the common color ratio condition. That is, the static common colors and dynamic common colors obtained in step 1607 are present in every shot of the shot group.

[0131] The static and dynamic common colors represent the colors of objects present throughout the shot group, so they can be regarded as characteristics of the shot group. For example, they represent the background colors for shots with a similar background, and the subject colors for shots of a similar subject. Therefore, in step 412, pixels of the static and dynamic common colors are displayed as usual, and all other pixels are displayed at half luminance. Any processing method may be used as long as it emphasizes the static and dynamic common colors, such as setting the luminance of all other pixels to zero.
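The emphasis processing of step 412 might look like the sketch below. It simplifies in two ways that are assumptions rather than the patent's method: it scales the R, G, and B values as a stand-in for halving luminance, and it checks each pixel against a single set of common color bins instead of treating static and moving regions separately. The 512-bin quantization and all names are also assumed.

```python
from typing import List, Set, Tuple

Pixel = Tuple[int, int, int]  # 8-bit (R, G, B), an assumed input format


def quantize(pixel: Pixel) -> int:
    """Map an 8-bit RGB pixel to one of 512 bins (8 levels per channel)."""
    r, g, b = (v >> 5 for v in pixel)
    return (r << 6) | (g << 3) | b


def emphasize(frame: List[List[Pixel]], common: Set[int], factor: float = 0.5) -> List[List[Pixel]]:
    """Keep pixels whose color bin is a common color; dim all others by `factor`."""
    return [
        [p if quantize(p) in common else tuple(int(v * factor) for v in p) for p in row]
        for row in frame
    ]


frame = [[(250, 0, 0), (0, 200, 0)]]
out = emphasize(frame, common={quantize((250, 0, 0))})
```

Passing `factor=0.0` gives the variant mentioned in the text in which all other pixels are set to zero luminance.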

[0132] The image processing method is chosen to suit the shot integration method, and many variations exist. For example, when shots of almost equal duration are integrated, one possible processing method is to display the average shot duration below the partial moving image, as shown in FIG. 22.

[0133] After step 412 ends, processing returns to step 402. If the characteristics of the shot group need not be made explicit, step 412 may be skipped and processing may return to step 402 after step 411 ends.

[0134] The above is one embodiment of the video summarization process of the first computer 5 in FIG. 1, and is the detailed description of step 201 in FIG. 2.

[0135] After the video summarization process ends, step 202 of FIG. 2 is executed. In step 202, the first computer 5 in FIG. 1 controls the video disk device 1 or the VTR 2 to play back the video to be processed and the video summarized in step 201. The played-back video is compressed in MPEG format by the video compression device 4 of FIG. 1 and stored in the file server 6.

[0136] However, step 201 of this embodiment does not create the summarized video itself. It creates a playback method for the summary video, as shown in FIG. 3. The summary video is therefore played back by controlling the video disk device 1 or the VTR 2 according to this playback method.

[0137] Compression need not use the MPEG format; another compression format such as JPEG may be used. As already described, the summary video may be stored not only by compressing the video directly but also by various other methods.

[0138] The following describes the summary video playback process of the second computer 7 in FIG. 1, which is the concrete operation of step 203 in FIG. 2.

[0139] FIG. 23 is a flowchart of one embodiment of the summary video playback process of the second computer 7 in FIG. 1.

[0140] In step 2301, a video display method is selected. Many options are conceivable. For example, to view the content of the video in detail, "standard-speed playback of the video to be processed" may be selected. During display, the video the user wants to see is played back at normal speed.

[0141] To grasp the content with a focus on the motion of the subject, the "section variable-speed summary video" created by steps 405 through 408 of FIG. 4 may be selected. During display, the entire video is played back while the playback speed is kept within a subjectively acceptable range.

[0142] Subjective impressions differ between individuals. The user may therefore select a playback speed determination method to suit his or her own perception. Specifically, instead of executing all of the following three processes, only some of them may be executed, or a process such as "making the playback speed relatively slower in finely textured parts of the video" may be added.

[0143]
・Step 405, which makes the playback speed relatively faster in long-term similar sections
・Step 406, which makes the playback speed relatively slower in high-speed motion sections
・Step 408, which makes the playback speed relatively slower for shots of short duration

Furthermore, to learn the temporal flow of the content of the video, the "rhythm presentation summary video" created by step 409 of FIG. 4 may be selected. During display, shots that each group together parts with the same content are played back one after another.

[0144] To view small portions of content that differ as much as possible, the "partial selection summary video" created by steps 410 through 412 of FIG. 4 may be selected. During display, shot groups, each combining time-series shots with similar content, are played back one after another.

[0145] In step 2302 of FIG. 23, the required video is retrieved from the file server 6 of FIG. 1 according to the selection made in step 2301, and is played back. However, if only the frame number information for creating the summary video is stored in the file server 6, the video to be processed is played back while being summarized according to that frame number information. Before playback, the user may specify both the frame at which to start and the frame at which to stop, or may specify only the starting frame and stop playback at any desired point while watching.

[0146] The flow of the video summarization process shown in FIG. 4 may be determined according to the options offered in step 2301. For example, if the options of step 2301 do not include the "rhythm presentation summary video", step 409 of FIG. 4 need not be executed. Similarly, if the "section variable-speed summary video" is not included, steps 405 through 408 of FIG. 4 need not be executed, and if the "partial selection summary video" is not included, steps 410 through 412 of FIG. 4 need not be executed.
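The dependency between display options and the optional steps of FIG. 4 can be written as a small lookup table. The option keys and the function name below are hypothetical labels chosen for illustration. Only the step numbers come from the text, and the sketch assumes that "steps 405 through 408" includes step 407.

```python
from typing import Iterable, List

# Optional steps of FIG. 4 needed by each display option (keys are illustrative labels).
STEPS_FOR_OPTION = {
    "standard_speed": [],
    "section_variable_speed": [405, 406, 407, 408],
    "rhythm_presentation": [409],
    "partial_selection": [410, 411, 412],
}


def steps_to_run(options: Iterable[str]) -> List[int]:
    """Return the optional summarization steps required by the offered options."""
    steps = set()
    for option in options:
        steps.update(STEPS_FOR_OPTION.get(option, []))
    return sorted(steps)


needed = steps_to_run(["rhythm_presentation", "partial_selection"])
```

With only these two options offered, steps 405 through 408 are absent from the result and can be skipped.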

[0147] In this embodiment, the summarized video and the video to be processed are played back on the second computer 7 of FIG. 1, but they may be played back on any other computer connected to the file server 6. For example, they may be played back on the first computer 5 of FIG. 1, and if two or more computers are connected to the file server 6, they may be played back on all of those computers.

[0148] As described above, in the section variable-speed summary video of this embodiment, a time section in which similar images continue for a certain time or more is detected and regarded as a long-term similar section, and the long-term similar section is played back relatively fast. The video then changes in quick succession, so parts where the playback speed is subjectively slow disappear. In addition, a time section containing a fast-moving object is detected and regarded as a high-speed motion section, and the playback speed of the high-speed motion section is made relatively slow. Fast-moving objects then no longer appear, so parts where the playback speed is subjectively fast are reduced. Furthermore, the video is divided into a plurality of shots, one for each unit of content, and a lower limit is set on the display duration of each shot. Each unit of content is then displayed for at least a certain time, so parts where the playback speed is subjectively fast disappear. By playing back the entire video while keeping the playback speed within a subjectively acceptable range in this way, user fatigue is reduced compared with conventional methods. The user can also grasp the content with a focus on the motion of the subject.
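For the high-speed motion detection summarized above, the claims describe a per-pixel test based on two luminance differences: one against the surrounding pixels in the same frame, and one against the pixels around the corresponding position in the next frame. The sketch below is one possible reading. It assumes that the first difference is the largest difference to any neighbor in the current frame, and that the second is the smallest difference to any pixel in the same neighborhood (center included) of the next frame. The neighborhood radius and all names are also assumptions.

```python
from typing import Iterator, List

Frame = List[List[int]]  # luminance values, one inner list per row (assumed layout)


def neighborhood(frame: Frame, y: int, x: int, radius: int, include_center: bool) -> Iterator[int]:
    """Yield luminance values around (y, x), clipped at the frame border."""
    h, w = len(frame), len(frame[0])
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if (dy != 0 or dx != 0 or include_center) and 0 <= y + dy < h and 0 <= x + dx < w:
                yield frame[y + dy][x + dx]


def is_fast_pixel(cur: Frame, nxt: Frame, y: int, x: int, theta_w1: int, radius: int = 1) -> bool:
    """True when the pixel lies in a flat area of `cur` (first difference below
    theta_w1) but matches nothing near the same position in `nxt` (second
    difference above theta_w1)."""
    p = cur[y][x]
    first = max((abs(p - q) for q in neighborhood(cur, y, x, radius, False)), default=0)
    second = min(abs(p - q) for q in neighborhood(nxt, y, x, radius, True))
    return first < theta_w1 and second > theta_w1


cur = [[200] * 3 for _ in range(3)]                # flat bright area
gone = [[10] * 3 for _ in range(3)]                # object left the neighborhood
near = [[10, 10, 10], [10, 10, 200], [10, 10, 10]]  # object moved by one pixel
fast = is_fast_pixel(cur, gone, 1, 1, theta_w1=30)
slow = is_fast_pixel(cur, near, 1, 1, theta_w1=30)
```

Under this reading, a pixel counts as fast only when the object has moved farther than the search radius between consecutive frames.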

[0149] In the rhythm presentation summary video of this embodiment, an upper limit is set on the display duration of each shot, the playback speed is then determined so that at least one of the following two conditions is satisfied, and the time-series shots are played back one after another while evoking a rhythm.

[0150] Rhythm condition: the shot boundaries correlate with a rhythm of a preset period.

[0151] Content condition: when the speed upper limit value, which is the reference for determining the playback speed, is set so as to guarantee that the content of any part played back at or below that speed can always be grasped, the playback speed of at least part of each shot is at or below the speed upper limit value.

[0152] Summarizing the video so that the rhythm condition is satisfied lets the user anticipate shot changes, so every shot can be seen without being overlooked.

[0153] Summarizing the video so that the content condition is satisfied lets the user grasp the content of at least part of every shot. Because each shot groups together parts with the same content, the user can thereby grasp the content of every shot.

[0154] However, if the rhythm condition is not satisfied, shot changes cannot be anticipated, so shots are overlooked. If the content condition is not satisfied, the content of some shots cannot be grasped. It is desirable to satisfy the rhythm condition and the content condition at the same time.
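One way to satisfy both conditions at once is the two-case rule given in the claims, which uses a reference time length Trhy and a speed upper limit. The sketch below assumes that playback speed means source seconds shown per display second. In the second case it also assumes that the first half runs at exactly the upper limit and that the rest of the shot fills the second half, whereas the claim only requires the first half to be at or below the limit. The function name and the return format are hypothetical.

```python
from typing import List, Tuple


def plan_shot(shot_len: float, t_rhy: float, v_max: float) -> List[Tuple[float, float]]:
    """Return a list of (display_duration, playback_speed) segments for one shot."""
    needed = shot_len / t_rhy  # speed needed to show the whole shot within t_rhy
    if needed <= v_max:
        return [(t_rhy, needed)]
    # Too fast to follow: use 2 * t_rhy, with the first half at the speed limit
    # and the remainder of the shot squeezed into the second half.
    rest = shot_len - v_max * t_rhy
    return [(t_rhy, v_max), (t_rhy, rest / t_rhy)]


short_plan = plan_shot(shot_len=6.0, t_rhy=2.0, v_max=4.0)
long_plan = plan_shot(shot_len=20.0, t_rhy=2.0, v_max=4.0)
```

Every shot is shown for a whole multiple of Trhy, which keeps shot boundaries on a regular beat, and at least the first Trhy of each shot stays at or below the speed limit.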

[0155] Furthermore, in the partial selection summary video of this embodiment, similar shots are integrated into shot groups using the correlation between time-series shots, and the partial moving images selected from the shot groups are played back one after another. Using this summary video, the user can efficiently view parts whose content differs as much as possible.

[0156]

[Effects of the Invention] As described above, the present invention first detects a time section in which similar images continue for a certain time or more, regards it as a long-term similar section, and plays back the long-term similar section relatively fast. The video thereby changes in quick succession, and parts where the playback speed is subjectively slow can be eliminated. In addition, a time section of images containing a fast-moving object is detected and regarded as a high-speed motion section, and the playback speed of the high-speed motion section is made relatively slow. Fast-moving objects are thereby eliminated, and parts where the playback speed is subjectively fast can be reduced. Furthermore, parts with the same content are grouped into shots, and a lower limit is set on the display duration of each shot. Each unit of content is thereby displayed for at least a certain time, and parts where the playback speed is subjectively fast can be eliminated. By playing back the entire video while keeping the playback speed within a subjectively acceptable range in this way, user fatigue is reduced compared with conventional methods. The user can also grasp the content with a focus on the motion of the subject.

[0157] Second, an upper limit is set on the display duration of each shot, the playback speed is then determined so that at least one of the following two conditions is satisfied, and the time-series shots are played back one after another while evoking a rhythm.

[0158] Rhythm condition: the shot boundaries correlate with a rhythm of a preset period.

[0159] Content condition: when the speed upper limit value, which is the reference for determining the playback speed, is set so as to guarantee that the content of any part played back at or below that speed can always be grasped, the playback speed of at least part of each shot is at or below the speed upper limit value.

[0160] Summarizing the video so that the rhythm condition is satisfied lets the user anticipate shot changes, so every shot can be seen without being overlooked.

[0161] Summarizing the video so that the content condition is satisfied lets the user grasp the content of at least part of every shot. Because each shot groups together parts with the same content, the user can thereby grasp the content of every shot.

[0162] However, if the rhythm condition is not satisfied, shot changes cannot be anticipated, so shots are overlooked. If the content condition is not satisfied, the content of some shots cannot be grasped. It is desirable to satisfy the rhythm condition and the content condition at the same time.

[0163] Third, similar shots are integrated into shot groups using the correlation between time-series shots, and the partial moving images selected from the shot groups are played back one after another. Using this summary video, the user can efficiently view parts whose content differs as much as possible.

[Brief description of drawings]

FIG. 1 is an overall system diagram of a video summarization device according to one embodiment of the present invention.

FIG. 2 is a flowchart of the operation of the video summarization device in the embodiment.

FIG. 3 is a conceptual diagram of the file-format representation of a summary video in the embodiment.

FIG. 4 is a flowchart of the video summarization process in the embodiment.

FIG. 5 is a flowchart of the long-term similar section detection process in the embodiment.

FIG. 6 is a diagram showing an example of creating partial regions in the embodiment.

FIG. 7 is a diagram showing a part of the video that changes gradually in the embodiment.

FIG. 8 is a flowchart of the high-speed motion section detection process in the embodiment.

FIG. 9 is a diagram showing a case where an object on the screen moves at high speed in the embodiment.

FIG. 10 is a diagram showing the comparison of pixels using only equation (6) in the embodiment.

FIG. 11 is a diagram showing the comparison of pixels at position p in equations (6) and (7) in the embodiment.

FIG. 12 is a diagram showing the region in which high-speed motion pixel detection is performed in the embodiment.

FIG. 13 is a flowchart of the rhythm presentation summarization process in the embodiment.

FIG. 14 is a conceptual diagram of a playback method that satisfies the rhythm condition in the embodiment.

FIG. 15 is a conceptual diagram of the playback method for the rhythm presentation summary video in the embodiment.

FIG. 16 is a flowchart of the shot integration process in the embodiment.

FIG. 17 is a conceptual diagram of shots with a similar background in the embodiment.

FIG. 18 is a conceptual diagram of shots of a similar subject in the embodiment.

FIG. 19 is a conceptual diagram of a moving region in the embodiment.

FIG. 20 is a conceptual diagram of a moving region in the embodiment.

FIG. 21 is a conceptual diagram of the end determination process in the embodiment.

FIG. 22 is a conceptual diagram of a method of processing a partial moving image in the embodiment.

FIG. 23 is a flowchart of the summary video playback process in the embodiment.

[Explanation of symbols]

1 Video disk device
2 VTR
3 Frame memory
4 Video compression device
5 Computer
6 File server
7 Computer

Claims

[Claims]

1. A video summarization method comprising: calculating a similarity between frame images of a video to be summarized; detecting a long-term similar section, which is a time section in which similar images continue for a predetermined time or more; and playing back the long-term similar section faster than time sections other than the long-term similar section.

2. The video summarization method according to claim 1, wherein the two frame images are each divided into a plurality of partial regions, and the similarity between the two frame images is calculated using the similarity of the partial regions.

3. A video summarization method comprising: detecting, from frame images of a video to be summarized, a high-speed motion section, which is a time section in which an object on the screen moves faster than in other time sections; and playing back the high-speed motion section more slowly than time sections other than the high-speed motion section.

4. The video summarization method according to claim 3, wherein the high-speed motion section is detected in a predetermined area of the frame image.

5. The video summarization method according to claim 4, wherein time-series frame images IM (M is a natural number of 1 or more) are sampled from the video to be summarized; for each frame image, pixels PN are found for which the luminance difference between the pixel PN in a specific frame image IN (N is a natural number of 1 or more and less than M) and the pixels around the pixel PN+1 corresponding to the pixel PN in the next frame image IN+1 exceeds a preset threshold θW1; and the high-speed motion section is obtained using the pixels PN.

6. The video summarization method according to claim 3 or 4, wherein time-series frame images IM (M is a natural number of 1 or more) are sampled from the video to be summarized; a first luminance difference, which is the luminance difference between a pixel PN in a specific frame image IN (N is a natural number of 1 or more and less than M) and the pixels around the pixel PN, and a second luminance difference, which is the luminance difference between the pixel PN and the pixels around the pixel PN+1 corresponding to the pixel PN in the next frame image IN+1, are calculated; for each frame image, pixels PN are found for which the first luminance difference is less than a preset threshold θW1 and the second luminance difference exceeds the threshold θW1; and the high-speed motion section is obtained using the pixels PN.

7. The video summarization method according to claim 3, wherein the high-speed motion section is obtained by using a motion vector.

8. A frame image of a video to be summarized is divided into a plurality of shots for each content, and a lower limit is set for a display time length of the shot according to the content of the shot. The video summarization method described in any one of.

9. A video summarization method comprising: dividing frame images of a video to be summarized into a plurality of shots, one for each unit of content; determining a playback speed so that the boundaries between the shots correlate with a rhythm of a preset period; and playing back the shots one after another while evoking the rhythm.

10. The video summarization method according to claim 9, wherein the rhythm is evoked using sound.

11. The video summarization method according to claim 9, wherein playback is interrupted at intervals of a preset time Tlen and the frame image displayed at that moment is shown for a preset time Tstl, thereby generating a rhythm with a period of (Tlen + Tstl), and the playback speed is determined so that the rhythm correlates with the shot boundaries.

12. A video summarizing method in which the frame images of a video to be summarized are divided by content into a plurality of shots, a speed upper limit value serving as a criterion for determining the playback speed is determined according to the content of the video, the playback speed of each shot is determined so that at least a part of the shot is reproduced at a speed equal to or lower than the speed upper limit value and the display time length of the shot is equal to or less than a time length upper limit value, which is a preset threshold value, and the shots are sequentially reproduced.

13. A video summarizing method in which the frame images of a video to be summarized are grouped by content and divided into a plurality of shots; a speed upper limit value serving as a criterion for determining the playback speed is determined according to the content of the video; when the playback speed required to display an entire shot within a preset reference time length Trhy is equal to or lower than the speed upper limit value, the playback speed of the shot is determined so that the display time length of the shot equals the reference time length Trhy; and when the playback speed required to display the entire shot within the reference time length Trhy exceeds the speed upper limit value, the display time length of the shot is set to twice the reference time length Trhy and the playback speed of the shot is determined so that the playback speed during the first half of the display time length, that is, the first Trhy, is equal to or lower than the speed upper limit value.
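The two cases of claim 13 can be illustrated with a short sketch. It assumes, beyond what the claim states, that in the second case the first half is played at exactly the speed upper limit and that the second half is played at whatever speed covers the remainder of the shot; the function name `plan_shot_playback` is hypothetical.

```python
def plan_shot_playback(shot_len, t_rhy, speed_limit):
    """Return a list of (display_seconds, playback_speed) segments for a shot.

    shot_len    : duration of the shot in the source video (seconds)
    t_rhy       : reference time length Trhy (seconds)
    speed_limit : speed upper limit value for this shot
    """
    needed = shot_len / t_rhy  # speed required to show the whole shot in Trhy
    if needed <= speed_limit:
        # Case 1: the whole shot fits into Trhy without exceeding the limit.
        return [(t_rhy, needed)]
    # Case 2: display time is doubled to 2 * Trhy. The first Trhy runs at the
    # limit (assumption); the second Trhy covers the rest of the shot.
    remaining = shot_len - speed_limit * t_rhy
    return [(t_rhy, speed_limit), (t_rhy, remaining / t_rhy)]
```

With Trhy = 2 s and a limit of 4, a 6-second shot is shown in one 2-second segment at speed 3, while a 20-second shot is shown in two 2-second segments at speeds 4 and 6.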

14. The video summarizing method according to claim 12, wherein a similarity between frame images is calculated, a long-time similar section, which is a time section in which similar content continues for a predetermined time or longer, is detected, and the speed upper limit value for a shot that includes the long-time similar section is set to a value larger than that for a shot that does not include a long-time similar section.

15. The video summarization method according to claim 14, wherein each of the two frame images is divided into a plurality of partial regions, and the similarity between the two frame images is calculated using the similarity of the partial regions.
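The region-based similarity of claim 15 could be computed along the following lines. The sketch assumes that frames are 2-D lists of luminance values, that the partial regions form a regular grid, that two regions are "similar" when their mean luminance differs by at most a tolerance `tol`, and that the frame similarity is the fraction of similar regions. None of these choices is fixed by the claim, and the function names are illustrative.

```python
def block_means(frame, blocks_y, blocks_x):
    """Mean luminance of each grid block, in row-major block order.

    Pixels left over when the frame size is not a multiple of the grid
    size are ignored.
    """
    rows, cols = len(frame), len(frame[0])
    bh, bw = rows // blocks_y, cols // blocks_x
    means = []
    for by in range(blocks_y):
        for bx in range(blocks_x):
            vals = [frame[r][c]
                    for r in range(by * bh, (by + 1) * bh)
                    for c in range(bx * bw, (bx + 1) * bw)]
            means.append(sum(vals) / len(vals))
    return means


def frame_similarity(frame_a, frame_b, blocks_y=2, blocks_x=2, tol=10):
    """Fraction of partial regions whose mean luminance agrees within tol."""
    means_a = block_means(frame_a, blocks_y, blocks_x)
    means_b = block_means(frame_b, blocks_y, blocks_x)
    similar = sum(1 for a, b in zip(means_a, means_b) if abs(a - b) <= tol)
    return similar / len(means_a)
```

Comparing region by region keeps a local change, such as a caption or a small moving object, from lowering the similarity of the whole frame by more than the share of regions it touches.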

16. The video summarizing method according to claim 12, wherein a high-speed motion section, which is a time section in which an object on the screen moves faster than in other time sections, is detected from the frame images of the video to be summarized, and the speed upper limit value for a shot that includes the high-speed motion section is set to a value smaller than that for a shot that does not include a high-speed motion section.

17. The video summarizing method according to claim 16, wherein the high-speed motion section is detected in a predetermined area of the frame image.

18. The video summarizing method according to claim 16, wherein time-series frame images I_M (M is a natural number of 1 or more) are sampled from the video to be summarized; the luminance difference between a pixel P_N in a specific frame image I_N (N is a natural number of 1 or more and less than M) and the pixels around the pixel P_{N+1} corresponding to the pixel P_N in the next frame image I_{N+1} is calculated; the pixels P_N for which the luminance difference exceeds a preset threshold value θW1 are obtained for each frame image; and the high-speed motion section is calculated using the pixels P_N.

19. The video summarizing method according to claim 16 or 17, wherein time-series frame images I_M (M is a natural number of 1 or more) are sampled from the video to be summarized; a first luminance difference, which is the luminance difference between a pixel P_N in a specific frame image I_N (N is a natural number of 1 or more and less than M) and the pixels around the pixel P_N, is calculated; a second luminance difference, which is the luminance difference between the pixel P_N and the pixels around the pixel P_{N+1} corresponding to the pixel P_N in the next frame image I_{N+1}, is calculated; the pixels P_N for which the first luminance difference is less than a preset threshold value θW1 and the second luminance difference exceeds the threshold value θW1 are obtained for each frame image; and the high-speed motion section is calculated using the pixels P_N.

20. The video summarizing method according to claim 16, wherein the high-speed motion section is obtained by using a motion vector.

21. A video summarizing method in which the frame images of a video to be summarized are divided into a plurality of shots, similar shots in the time series are merged by using the correlation between the shots and regarded as a shot group, a plurality of partial moving images, each being a sequence of time-series frame images, are selected from the shot group, and the partial moving images are sequentially reproduced.

22. The video summarizing method according to claim 21, wherein a feature common to all shots included in the shot group is obtained, the partial moving images are processed based on the feature, and the processed partial moving images are sequentially reproduced.

23. The video summarizing method according to claim 22, wherein a feature common to all shots included in the shot group is obtained, the partial moving images are processed based on the feature, and the leading frame images of the processed partial moving images are reduced in size and displayed as a list on the screen.

24. The video summarizing method according to claim 21, wherein shots in time series in which a similar background and a similar subject are photographed are detected and set as shot groups.

25. The video summarizing method according to any one of claims 21 to 24, wherein a representative spatiotemporal image, which consists of frame images selected from a shot, is processed and divided into a moving region, which is a region affected by changes due to motion, and a still region, which is the region other than the moving region, a color histogram is created for each region, and the correlation between shots is obtained using the color histograms.
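A minimal sketch of the histogram comparison in claim 25 follows. It assumes that colors have already been quantized to integer indices, that a boolean mask marking the moving region is available from an earlier step, and that histogram intersection averaged over the two regions serves as the correlation measure. The claim does not name a specific measure, and all function names are hypothetical.

```python
from collections import Counter


def normalize(counter):
    """Turn raw color counts into relative frequencies."""
    total = sum(counter.values())
    return {k: v / total for k, v in counter.items()} if total else {}


def region_histograms(colors, moving_mask):
    """Build one normalized color histogram per region.

    colors      : flat list of quantized color indices, one per pixel
    moving_mask : flat list of booleans, True where the pixel is moving
    """
    moving = Counter(c for c, m in zip(colors, moving_mask) if m)
    still = Counter(c for c, m in zip(colors, moving_mask) if not m)
    return normalize(moving), normalize(still)


def intersection(h1, h2):
    """Histogram intersection: 1.0 for identical, 0.0 for disjoint."""
    return sum(min(v, h2.get(k, 0.0)) for k, v in h1.items())


def shot_correlation(shot_a, shot_b):
    """Average the intersections of the moving and still histograms.

    Each argument is a (colors, moving_mask) pair for the shot's
    representative image. Equal weighting is an assumption.
    """
    moving_a, still_a = region_histograms(*shot_a)
    moving_b, still_b = region_histograms(*shot_b)
    return (intersection(moving_a, moving_b) + intersection(still_a, still_b)) / 2
```

Keeping the two histograms separate lets two shots match only when both the subject colors and the background colors agree, which is the situation described in claim 24.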

26. The video summarizing method according to claim 25, wherein a common portion of the moving constituent colors, which are the colors appearing in the moving region, and a common portion of the still constituent colors, which are the colors appearing in the still region, are used for the partial moving images in the shot group.

27. The video summarizing method according to claim 21, wherein shots having substantially the same time length and continuous in time are detected and regarded as a shot group.
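The grouping rule of claim 27 can be sketched as a single pass over the shot lengths. "Substantially the same time length" is interpreted here as lying within a relative tolerance `rel_tol` of the first shot of the run, and a run must contain at least `min_run` shots to count as a group. Both parameters and the function name are assumptions made for illustration.

```python
def group_equal_length_shots(lengths, rel_tol=0.1, min_run=2):
    """Group consecutive shots whose lengths are nearly equal.

    lengths : shot durations in temporal order (seconds)
    Returns a list of groups, each a list of shot indices.
    """
    groups, start = [], 0
    for i in range(1, len(lengths) + 1):
        # Close the current run at the end of the list or when shot i
        # deviates too much from the first shot of the run.
        run_ends = (i == len(lengths)
                    or abs(lengths[i] - lengths[start]) > rel_tol * lengths[start])
        if run_ends:
            if i - start >= min_run:
                groups.append(list(range(start, i)))
            start = i
    return groups
```

For shot lengths of 3.0, 3.1, 2.9, 8.0, 5.0 and 5.2 seconds, the sketch returns the groups [0, 1, 2] and [4, 5]; the isolated 8-second shot stays ungrouped.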

28. The video summarizing method according to any one of claims 8 to 27, wherein a portion shot continuously in time by a video camera is regarded as a shot.

29. The video summarizing method according to claim 8, wherein one scene of the scenario is regarded as a shot.