JPH06309381A

JPH06309381A - Moving image processor

Info

Publication number: JPH06309381A
Application number: JP5093256A
Authority: JP
Inventors: Junichi Takahashi; 淳一高橋
Original assignee: IBM Japan Ltd
Current assignee: IBM Japan Ltd
Priority date: 1993-04-20
Filing date: 1993-04-20
Publication date: 1994-11-04

Abstract

PURPOSE: To provide a moving image processor with which each scene of a moving image can be easily indexed and retrieval conditions can be easily generated, so that a desired scene can be retrieved easily and in a short time. CONSTITUTION: A scene change detecting device 20 divides the original moving image C into scenes, each shorter than the moving image itself. For each scene, a scene editor 24 freely determines one or more representative frames, and a keyword indicating the semantic content of the scene is assigned to each scene. A key frame generating device 36 then selects, from the representative frames of the scenes, key frames that readily remind the user of the keywords. At retrieval time, the user selects several key frames and gives a logical condition. A retrieving device 42 replaces the selected key frames with their keywords, generates a keyword retrieval expression stated as a logical condition on those keywords, and retrieves the desired scene from the moving image on the basis of the generated expression.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a moving image processing apparatus, and more particularly to a moving image processing apparatus that retrieves a predetermined scene from a moving image composed of a plurality of temporally continuous still images.

[0002]

[Prior Art] With the spread of video equipment in recent years, museums, broadcasting stations, and similar institutions accumulate enormous amounts of video information such as still images. To obtain a desired image from this accumulated video information, it must be indexed and searched efficiently. The efficiency of indexing and searching matters most when the images are moving image data composed of a plurality of temporally continuous still images.

[0003] Conventionally, moving image processing apparatuses have been proposed in which a computer or the like stores the still images constituting a moving image as image data, stores the position information of each piece of image data within the moving image in a tree structure to form a moving image database, searches this database for desired image data, and displays the retrieved moving image on a display device such as a monitor (Japanese Patent Laid-Open Nos. 61-368968 and 61-29939, among others).

[0004] In this moving image processing apparatus, the display moves from the upper levels of the tree structure to the lower levels according to the user's instructions, and the image data that the user finally wants is read out and displayed on the display device.

[0005]

[Problems to Be Solved by the Invention] However, a moving image is generally divided into scenes by cuts, changes of camera angle, and the like, and searches are often made for these scenes. In a tree-structured moving image database as described above, the search narrows toward the desired image by successively displaying the lower-level still images that belong to the still image selected at the upper level. If the user makes a wrong selection partway through, the search may never reach an image of the desired scene. For this reason, when the moving image database is constructed, a tree structure with a suitable number of branches and a suitable depth must be designed carefully. Furthermore, when there are several desired scenes to be compared, reaching one particular scene does not help with the others: the tree must be traversed again from the top level for each of them, which makes the operation redundant and provides no overview for comparison.

[0006] Searching for a scene by entering keywords could be considered as a way of displaying these still images. However, it is difficult to make keywords carry all the information, and keyword assignment becomes complicated if all the necessary information is to be included.

[0007] Moreover, even for keywords that express the same meaning, the keyword the user specifies at search time may not match the keyword actually registered in the database, so that scenes that would satisfy the user are not retrieved. The user may also be unable to think of anything to enter as a keyword in the first place. In such cases the user must find suitable search keywords by trial and error, and search efficiency deteriorates.

[0008] Furthermore, because of the diversity of the semantic content of moving images, it is difficult to impose a controlled vocabulary, that is, to define in advance standard terms that every user can easily specify as keywords for facilitating searches.

[0009] In view of the above facts, an object of the present invention is to provide a moving image processing apparatus that can easily index each scene contained in a moving image and can easily generate search conditions, so that an image of a desired scene can be retrieved easily and in a short time.

[0010]

[Means for Solving the Problems] To achieve the above object, the moving image processing apparatus of the invention according to claim 1 comprises: storage means for storing a moving image that includes a plurality of scenes, each scene being a group of temporally continuous still images whose semantic content is the same or similar, and for storing, as key images, images selected from at least one representative image representing each scene; display means for displaying a plurality of the key images; input means for inputting selection information for selecting key images displayed on the display means, and a search condition for a selected key image or a search condition relating a plurality of selected key images; and search means for searching for a scene on the basis of the selection information and the search condition.

[0011] The moving image processing apparatus of the invention according to claim 2 comprises: classification means for classifying a moving image composed of many temporally continuous still images into a plurality of scenes, each scene being a group of still images whose semantic content is the same or similar; first selection means for selecting, from each classified scene, at least one still image representing the semantic content of that scene; second selection means for selecting key images from the selected still images; and storage means for storing the key images.

[0012] The first selection means may instead select, from each classified scene, at least one still image related to the semantic content of that scene.

[0013] The invention according to claim 3 is the moving image processing apparatus according to claim 2, characterized in that the second selection means selects, as key images, still images whose semantic content is such that the union of the semantic content of the selected still images matches the union of the semantic content of all scenes of the moving image.

[0014]

[Operation] The moving image used in the moving image processing apparatus of claim 1 includes a plurality of scenes, each scene being a group of temporally continuous still images whose semantic content is the same or similar. This moving image is stored in the storage means, which also stores, as key images, images selected from at least one representative image (that is, a representative still image) of each scene of the moving image. The display means displays the stored key images, so that key images related to the semantic content of each scene are displayed. Through the input means, selection information for selecting key images displayed on the display means is entered, together with a search condition for a selected key image or a search condition relating a plurality of selected key images. The user can thus select key images on the basis of an intuitive visual impression and can enter a search condition for a key image, or a search condition that relates several key images to one another. When the search means searches on the basis of the entered selection information and search condition, the scene the user wants is obtained as the search result.

[0015] In the moving image processing apparatus of claim 2, the classification means classifies a moving image composed of many temporally continuous still images into a plurality of scenes, each scene being a group of still images whose semantic content is the same or similar. From each classified scene, the first selection means selects at least one still image that represents, or is related to, the semantic content of that scene. The second selection means selects key images from the still images selected by the first selection means, and the storage means stores these key images. Storing the key images in the storage means thereby stores information related to the semantic content of each classified scene. Consequently, the search is not restricted to a unified vocabulary of keywords specified by the user at search time; the still images themselves can be used as the information corresponding to keywords (key images). For this reason, any convenient keywords can be used freely as the information used when the key images are recorded.

[0016] As described in claim 3, the second selection means may select, as key images, still images whose semantic content is such that the union of the semantic content of the selected still images matches the union of the semantic content of all scenes of the moving image. In this case, when the same semantic content appears in several scenes, a single still image having that semantic content suffices. The number of key images selected can therefore be reduced to the necessary minimum.

[0017]

[Embodiment] An embodiment of the present invention is described in detail below with reference to the drawings. In this embodiment, the present invention is applied to a moving image processing system 10 that performs index processing, which assigns an index to each scene of a moving image, and search processing, which searches for each scene.

[0018] As shown in FIG. 1, the moving image processing system 10 comprises a moving image storage device 12, a scene change detection device 20, a scene editor device 24, a key frame generation device 36, and a search device 42, each of which includes a microcomputer.

[0019] These components are also disclosed in earlier applications by the present applicant (Japanese Patent Application Nos. 4-160034 and 4-65620) and in a technical disclosure (IBM Technical Disclosure Bulletin, Vol. 34, No. 10A, March 1992).

[0020] The moving image storage device 12 includes a laser disc player (hereinafter, LD player) 14 and a video signal converter 16, and is connected to a TV monitor 18 for displaying moving images. The LD player 14 can be loaded with a laser disc (LD) on which a plurality of temporally continuous still images (single-frame images, hereinafter called frames) are recorded, and a moving image is obtained by playing back this LD. The frames are recorded on the LD as a video signal (an analog signal) such as NTSC, and the video signal converter 16 converts this video signal into a digital signal and outputs it as image data.

[0021] The moving image storage device 12 is connected to the scene change detection device 20, which has a first storage device 22, and to the scene editor device 24, so that the image data output from the video signal converter 16 is input to them. The scene change detection device 20 is also connected to the scene editor device 24.

[0022] The scene editor device 24 has second, third, and fourth storage devices 26, 28, and 30, as well as a display device 32 and an input device 34. The key frame generation device 36 and the search device 42 are connected to the scene editor device 24.

[0023] The key frame generation device 36 has fifth and sixth storage devices 38 and 40. The search device 42 comprises a search expression generation device 44 and a search processing device 46, and has a display device 48 and an input device 50.

[0024] The operation of this embodiment is described below. In this embodiment, predetermined frames of the moving image C are displayed as keyword images (hereinafter, key frames), that is, key images that carry both the keyword information used as search elements and the visual information of the frame. When the user selects key frames and gives logical conditions, a query expression using keywords (hereinafter, keyword search expression) Q is generated for retrieving the desired scene, and the desired scene is retrieved from the moving image C on the basis of the generated keyword search expression Q. First, the key frame generation processing that generates these key frames is described in detail with reference to FIGS. 2 and 3.

[0025] The scene change detection device 20 follows the temporal flow of the image data of the moving image C supplied from the LD player 14 and detects the frames at which a physical change between frames, such as a cut or a change of camera angle, or a change of semantic content (hereinafter, scene change) occurs. It stores the frame number fno of each detected frame in the first storage device 22 as scene change information I1. As a result, all frame numbers fno at which a scene change occurs in the moving image C are stored. The group of frames between adjacent frame numbers fno corresponds to a scene Si (i = 1, 2, ...). In the following description, a moving image C composed of five scenes S1, S2, S3, S4, and S5 is used as an example.
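The relationship between the stored frame numbers fno and the scenes Si can be sketched as follows. This is only an illustrative sketch: the function name `split_into_scenes`, the example frame numbers, and the assumptions that each stored frame number marks the first frame of a new scene and that frame 0 opens the first scene are not taken from the specification.

```python
def split_into_scenes(change_frames, total_frames):
    """Return one inclusive (start, end) frame range per scene.

    Assumption: every frame number in change_frames is the first frame
    of a new scene, and frame 0 always starts the first scene.
    """
    starts = sorted(set(change_frames) | {0})
    scenes = []
    for i, start in enumerate(starts):
        # A scene ends one frame before the next scene starts,
        # or at the last frame of the moving image.
        end = starts[i + 1] - 1 if i + 1 < len(starts) else total_frames - 1
        scenes.append((start, end))
    return scenes


# Hypothetical scene change information I1 for a 500-frame moving image.
scenes = split_into_scenes([100, 200, 300, 400], 500)
for number, (start, end) in enumerate(scenes, start=1):
    print(f"S{number}: frames {start} to {end}")
```

With these hypothetical change frames the moving image splits into five scenes, matching the five-scene example used in the description.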

[0026] The scene change information I1 stored by the scene change detection device 20 is output to the scene editor device 24. On the basis of the input scene change information I1, the scene editor device 24 selects, for each scene Si delimited by the scene change frames, a representative frame 80 (for example, the first frame) that represents the semantic content of the scene. Along with this selection, the scene editor device 24 outputs the frame number fno of the selected representative frame 80 to the moving image storage device 12. The image data of the representative frame 80 (hereinafter, representative image data) output from the video signal converter 16 of the moving image storage device 12 is thereby input to the scene editor device 24, which stores it in the second storage device 26. The scene editor device 24 also stores in the third storage device 28, as scene information I2 for the scene Si, the start frame number, the end frame number, and the representative frame number (for example, the first frame number) of the scene Si corresponding to the stored representative frame 80, together with the address in the second storage device 26 at which the image data of the representative frame is stored. Table 1 below shows an example of the scene information I2 for a moving image C that has five scenes S1 to S5 and frame numbers 0 to 499.

[0027]

[Table 1]

[0028] Here, the user operates the input device 34 and the display device 32 of the scene editor device 24 to create and assign at least one keyword Wj (j = 1, 2, ...) to each scene Si of the moving image C. The scene editor device 24 stores in the fourth storage device 30 keyword index information I3, a table that relates each keyword Wj created by the user to the group of frames lying between the start frame number and the end frame number associated with that keyword.

[0029] Specifically, when the user designates a representative frame 80 with the input device 34, the scene editor device 24 reads the scene information I2 stored in the third storage device 28 and the representative image data stored in the second storage device 26, and displays the representative frame 80 on the display device 32. While the representative frame 80 is displayed, the user can instruct playback of the scene Si corresponding to it. When the scene editor device 24 receives such a playback instruction, it sends the LD player 14 an instruction to play back only the designated scene Si on the TV monitor 18, so that only the designated scene Si of the moving image C is displayed. The user can enter a keyword Wj with the input device 34 while viewing the representative frame 80 or playing back the corresponding scene Si. When a keyword Wj is entered, the scene editor device 24 stores the start frame number and end frame number to which the keyword Wj is assigned in the fourth storage device 30 as keyword index information I3. These start and end frame numbers are entered by the user's instruction during playback of the scene Si or by an instruction from the input device 34. Table 2 shows an example of the keyword index information I3 for the moving image C of Table 1.

[0030]

[Table 2]

[0031] Keywords therefore correspond to each scene Si as follows. Scene S1 consists of the frames with frame numbers fno 0 to 99, so it includes the keyword {car}, which is assigned to the frames with frame numbers 0 to 199, and the keyword {mountain}, which is assigned to the frames with frame numbers 0 to 99. Scene S1 thus corresponds to the keywords {car, mountain}. In the same way, scene S2 corresponds to the keyword set {car, woman}, scene S3 to {woman, man}, scene S4 to {woman, mountain}, and scene S5 to {mountain}. Where several keywords appear inside { }, the braces denote the union of those keywords.
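The way keyword frame ranges determine the keyword set of each scene can be illustrated with a short sketch. Only the range of scene S1 (frames 0 to 99), the range of the keyword car (0 to 199), and the first range of mountain (0 to 99) come from the text; every other boundary below is a hypothetical value chosen so that the result reproduces the keyword sets listed above, and the names `SCENES`, `KEYWORD_INDEX`, and `keywords_per_scene` are illustrative.

```python
# Scene ranges (inclusive). Only S1 = 0..99 is stated in the text;
# the remaining boundaries are assumed for illustration.
SCENES = {
    "S1": (0, 99),
    "S2": (100, 199),
    "S3": (200, 299),
    "S4": (300, 399),
    "S5": (400, 499),
}

# Keyword index entries: (keyword, start frame, end frame).
# The last three entries are assumptions, not values from Table 2.
KEYWORD_INDEX = [
    ("car", 0, 199),
    ("mountain", 0, 99),
    ("woman", 100, 399),
    ("man", 200, 299),
    ("mountain", 300, 499),
]


def keywords_per_scene(scenes, keyword_index):
    """Assign to each scene every keyword whose frame range overlaps it."""
    result = {}
    for name, (s_start, s_end) in scenes.items():
        result[name] = {
            keyword
            for keyword, k_start, k_end in keyword_index
            if k_start <= s_end and s_start <= k_end  # ranges overlap
        }
    return result


scene_keywords = keywords_per_scene(SCENES, KEYWORD_INDEX)
for name, words in scene_keywords.items():
    print(name, sorted(words))
```

Treating a keyword as belonging to a scene whenever the two frame ranges overlap is one plausible reading of the description; a stricter rule (for example, requiring the keyword range to contain the whole scene) would give the same sets for this data.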

[0032] Next, when the user turns on a key frame generation execution switch (not shown) of the key frame generation device 36, the key frame generation routine of FIG. 3 is executed and the process proceeds to step 102. In step 102, a signal is output to the scene editor device 24, and the scene information I2 stored in the third storage device 28 and the keyword index information I3 stored in the fourth storage device 30 are input. In the next step 104, some of the representative frames 80 are selected as key frames 82 according to the selection criterion described later. Keyword/key frame correspondence information (hereinafter, WF correspondence information) I4, a table that relates each selected key frame 82 to its keywords, is generated and stored in the fifth storage device 38 (step 106). The image data of the representative frames 80 selected as key frames 82 is stored in the sixth storage device 40 as the image data of the key frames 82 (hereinafter, key image data) (step 108), and the routine ends.

[0033] In this case, the sixth storage device 40 may store address information of the key frames 82 instead of the image data.

[0034] The selection criterion is that the key frames 82 be selected so that the set of keywords Wj assigned to the scenes Si to which the selected representative frames 80 belong covers the set of keywords Wj assigned to all scenes Si.

[0035] For example, in Table 1 above, the three representative frames 80 whose frame numbers fno are f2, f3, and f5 are selected as key frames 82. The keywords assigned to the scenes S2, S3, and S5 to which these representative frames 80 belong are {car, woman} for scene S2, {woman, man} for scene S3, and {mountain} for scene S5. The union of the keywords of scenes S2, S3, and S5 equals the set of all keywords assigned to all scenes Si (scenes S1 to S5), {car, woman, man, mountain}, so the selection criterion is satisfied.
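The selection criterion can be checked mechanically. The sketch below uses the scene keyword sets given in the description; the function name `covers_all_keywords` and the dictionary layout are illustrative assumptions, not part of the disclosed apparatus.

```python
# Keyword sets per scene, as listed in the description.
SCENE_KEYWORDS = {
    "S1": {"car", "mountain"},
    "S2": {"car", "woman"},
    "S3": {"woman", "man"},
    "S4": {"woman", "mountain"},
    "S5": {"mountain"},
}


def covers_all_keywords(selected_scenes, scene_keywords):
    """True if the selected scenes' keywords cover every assigned keyword."""
    all_keywords = set().union(*scene_keywords.values())
    covered = set().union(*(scene_keywords[s] for s in selected_scenes))
    return covered == all_keywords


# Key frames taken from scenes S2, S3 and S5 satisfy the criterion.
print(covers_all_keywords(["S2", "S3", "S5"], SCENE_KEYWORDS))
# Key frames taken only from S1 and S2 miss the keyword "man".
print(covers_all_keywords(["S1", "S2"], SCENE_KEYWORDS))
```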

[0036] Several sets of key frames 82 may satisfy the selection criterion, but it is preferable to select as few key frames 82 as possible so that the user can easily survey them during the search processing described later. For example, when the set of keywords Wj corresponding to one key frame 82 is a subset of the set of keywords Wj corresponding to another key frame 82, the keywords of the former are also covered by the latter, so the former need not be kept redundantly as a key frame 82.
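The subset rule for discarding redundant key frames might be sketched as follows. The function name `drop_redundant` and the left-to-right processing order are assumptions made for illustration; the specification does not prescribe a particular procedure.

```python
# Keyword sets per scene, as listed in the description.
SCENE_KEYWORDS = {
    "S1": {"car", "mountain"},
    "S2": {"car", "woman"},
    "S3": {"woman", "man"},
    "S4": {"woman", "mountain"},
    "S5": {"mountain"},
}


def drop_redundant(candidates, scene_keywords):
    """Drop candidates whose keyword set is a subset of another kept one."""
    kept = []
    for scene in candidates:
        words = scene_keywords[scene]
        # Skip this candidate if an already kept key frame covers it.
        if any(words <= scene_keywords[k] for k in kept):
            continue
        # Remove kept key frames that the new candidate makes redundant.
        kept = [k for k in kept if not scene_keywords[k] <= words]
        kept.append(scene)
    return kept


# S5 = {mountain} is a subset of S1 = {car, mountain}, so it is dropped.
print(drop_redundant(["S1", "S2", "S3", "S4", "S5"], SCENE_KEYWORDS))
```

Note that the subset rule alone does not necessarily yield the smallest covering set: it keeps four key frames here, whereas the three key frames of scenes S2, S3, and S5 already satisfy the selection criterion.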

[0037] Table 3 shows an example of the WF correspondence information I4, which expresses the relationship between key frames 82 and keywords by associating the identifier of each key frame 82 with its keywords.

[0038]

[Table 3]

[0039] Next, the search processing that uses the key frames 82 described above is explained in detail with reference to FIGS. 4 to 6.

[0040] In the search device 42, the search expression generation device 44 generates a keyword search expression Q from the user's designation of key frames 82 and other operations of the input device 50 and the display device 48, and the search processing device 46 performs search processing on the basis of the generated keyword search expression Q. The representative frames 80 of the scenes Si found by the search are displayed on the display device 48. When the user instructs playback of a scene Si, a playback instruction signal for the scene Si is sent to the moving image storage device 12.
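The flow from selected key frames to a keyword search expression Q and then to matching scenes could look roughly like the sketch below. Several points are assumptions rather than statements of the specification: the contents of the hypothetical `WF` table, the representation of Q as nested tuples, and in particular the choice to expand a key frame with several keywords into an OR of those keywords, which the text up to this point does not specify.

```python
# Keyword sets per scene, as listed in the description.
SCENE_KEYWORDS = {
    "S1": {"car", "mountain"},
    "S2": {"car", "woman"},
    "S3": {"woman", "man"},
    "S4": {"woman", "mountain"},
    "S5": {"mountain"},
}

# Hypothetical WF correspondence information: key frame id -> keywords.
WF = {1: {"car", "woman"}, 2: {"woman", "man"}, 3: {"mountain"}}


def build_query(selected_key_frames, operator, wf):
    """Replace each selected key frame by its keywords and join them.

    Assumption: a key frame stands for the OR of its own keywords.
    """
    return (operator, [("OR", sorted(wf[k])) for k in selected_key_frames])


def matches(query, words):
    """Evaluate a nested (operator, operands) query against a keyword set."""
    operator, operands = query
    results = [
        matches(o, words) if isinstance(o, tuple) else o in words
        for o in operands
    ]
    return all(results) if operator == "AND" else any(results)


def search(query, scene_keywords):
    """Return the scenes whose keyword set satisfies the query."""
    return [s for s, words in scene_keywords.items() if matches(query, words)]


# The user ticks key frames 1 and 3 and chooses the AND operator:
# Q = (car OR woman) AND (mountain)
q = build_query([1, 3], "AND", WF)
print(search(q, SCENE_KEYWORDS))
```

The point of the sketch is that the user only manipulates key frame identifiers and a logical operator; the keywords themselves never have to be typed.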

[0041] Specifically, when the user turns on a search execution switch (not shown) of the search expression generation device 44, the keyword search expression generation routine of FIG. 5 is executed and the process proceeds to step 112. There, a signal is output to the key frame generation device 36, and the WF correspondence information I4 and the key image data are read from the key frame generation device 36 and stored in a memory (not shown). In the next step 114, the WF correspondence information I4 and the key image data are read from this memory, and a list of key frames 82 is displayed on the display device 48 as shown in FIG. 4.

[0042] The display screen 60 of the display device 48 has key frame display areas 62 for displaying a plurality of key frames 82 (four in this embodiment), and the key frames 82 corresponding to the key image data that was read are displayed in these key frame display areas 62.

[0043] Above each key frame display area 62 there is an identifier display area 63, which shows an identifier such as a serial number that identifies the displayed key frame 82, and a selection check button 64 for selecting the key frame 82. The user toggles the selection check button 64 on and off with a mouse or similar device of the input device 50 to select the key frame, and the button also indicates whether the key frame is selected.

【００４４】[0044]

On the right side of the display screen 60 (the right side of the page in FIG. 4) are a search button 68 for instructing execution of a search and a scroll button 70. The scroll button 70 is an instruction area for displaying the preceding or following key frames 82 when there are too many selectable key frames 82 to show on one screen. It consists of a roll-up button 70A, which displays the key frames of the previous screen in sequence, and a roll-down button 70B, which displays the key frames of the next screen in sequence.
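The scrolling behavior described above can be sketched as simple paging over the list of key frame identifiers. This is a minimal illustration, not the patent's implementation: the function names, the 0-based page index, and the clamping at the first and last screens are all assumptions. Only the figure of four key frames per screen comes from the text.

```python
from typing import List, Sequence

FRAMES_PER_SCREEN = 4  # the embodiment shows four key frames at a time


def visible_key_frames(identifiers: Sequence[int], page: int,
                       per_screen: int = FRAMES_PER_SCREEN) -> List[int]:
    """Return the key frame identifiers shown on the given 0-based page."""
    start = page * per_screen
    return list(identifiers[start:start + per_screen])


def scroll(page: int, total: int, direction: str,
           per_screen: int = FRAMES_PER_SCREEN) -> int:
    """Move one screen back ("up", roll-up button 70A) or forward
    ("down", roll-down button 70B).

    Clamping at the first and last screens is an assumption; the text
    does not say what happens at the ends of the list.
    """
    last_page = max(0, (total - 1) // per_screen)
    if direction == "up":
        return max(0, page - 1)
    if direction == "down":
        return min(last_page, page + 1)
    raise ValueError(f"unknown direction: {direction!r}")
```

With ten key frames, for instance, the third screen would hold only the last two identifiers.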

【００４５】[0045]

The upper part of the display screen 60 is a logical operator display area 66A, which contains a plurality of logical operator selection buttons 66. Each logical operator selection button 66 specifies a logical operator for connecting the selected key frames 82. The lower part of the display screen 60 is an arbitrary condition input field 72. This field displays the query expression (hereinafter referred to as the search expression) for the desired scene search that the user has entered directly with the input device 50.

【００４６】[0046]

In the next step 116, the selection input made by the user while viewing the display screen 60 is read, and the process proceeds to step 118. In step 118, it is determined whether the scroll button 70 has been turned on. If it has, in step 120 the key image data for the screen corresponding to whichever of the roll-up button 70A and the roll-down button 70B was pressed is fetched, the process returns to step 114, the corresponding key frames are displayed, and selection continues. If the determination is negative, the process proceeds to step 122.

【００４７】[0047]

In step 122, it is determined whether the search button 68 has been pressed, and the processing of steps 116 to 118 is repeated until it is pressed. When the search button 68 is pressed, the process proceeds to step 124. In step 124, whether a search condition has been entered is determined by checking whether one or more key frames 82 have been selected or a character string has been entered in the arbitrary condition input field 72. If the determination is negative, the process returns to step 116.
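The loop of steps 116 to 124 can be sketched as an event loop that keeps reading user input until the search button is pressed with a usable condition. The event names (`"toggle"`, `"text"`, `"search"`) and the function names are hypothetical, scroll events are left out for brevity, and treating a whitespace-only string as empty is an assumption.

```python
from typing import Iterable, Optional, Set, Tuple


def has_search_condition(selected: Set[int], typed_text: str) -> bool:
    """Step 124: a condition exists if at least one key frame is selected
    or the arbitrary condition input field contains text."""
    return bool(selected) or bool(typed_text.strip())


def run_selection_loop(
    events: Iterable[Tuple[str, object]]
) -> Optional[Tuple[Set[int], str]]:
    """Consume (kind, value) events until a search with a condition is requested.

    Returns the selected identifiers and the typed text, or None if the
    event stream ends first.
    """
    selected: Set[int] = set()
    typed_text = ""
    for kind, value in events:            # step 116: read the next user input
        if kind == "toggle":              # selection check button 64
            selected ^= {value}           # select if unselected, else deselect
        elif kind == "text":              # arbitrary condition input field 72
            typed_text = str(value)
        elif kind == "search":            # step 122: search button 68
            if has_search_condition(selected, typed_text):   # step 124
                return selected, typed_text
            # no condition yet: keep reading input (back to step 116)
    return None
```

A search request with nothing selected and nothing typed is simply ignored, which mirrors the return to step 116 on a negative determination.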

【００４８】[0048]

The keyword search expression Q is generated, as described later, in response to either a search condition entered directly in the arbitrary condition input field 72 or the selection of key frames together with an instruction to connect the selected key frames with logical operators.

【００４９】[0049]

In this embodiment, "AND", "OR", "&", "|", and "¬" are used as the logical operators that connect the key frames 82. The logical operators "AND" and "OR" apply set operations to the sets of keywords Wj corresponding to the two key frames 82 that are their operands: "AND" yields the intersection and "OR" yields the union.

【００５０】[0050]

For example, an instruction to connect the key frame 82 with identifier 1 and the key frame 82 with identifier 2 shown in FIG. 4 with the logical operator "AND" can be expressed by the following search expression (1).

【００５１】[0051]

1 AND 2 ... (1)

In this case, as shown in Table 3 above, the expression corresponds to specifying the search condition "scenes Si having the keyword {female}", which is the keyword common to the key frames 82 with identifiers 1 and 2.

【００５２】[0052]

Similarly, an instruction to connect the key frame 82 with identifier 1 and the key frame 82 with identifier 2 with the logical operator "OR" can be expressed by the following search expression (2).

【００５３】[0053]

1 OR 2 ... (2)

This corresponds to specifying the search condition "scenes Si having any of the keywords {car}, {female}, or {male}".
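Search expressions (1) and (2) map directly onto set intersection and union. The sketch below uses the keyword sets that the text gives for key frames 1 and 2. The dictionary and function names are illustrative assumptions.

```python
# Keyword sets for key frames 1 and 2 as given in the text.
KEY_FRAME_KEYWORDS = {
    1: {"car", "female"},
    2: {"female", "male"},
}


def keyword_and(left: set, right: set) -> set:
    """Logical operator AND: keywords common to both key frames."""
    return left & right


def keyword_or(left: set, right: set) -> set:
    """Logical operator OR: keywords belonging to either key frame."""
    return left | right


q1 = keyword_and(KEY_FRAME_KEYWORDS[1], KEY_FRAME_KEYWORDS[2])  # expression (1)
q2 = keyword_or(KEY_FRAME_KEYWORDS[1], KEY_FRAME_KEYWORDS[2])   # expression (2)
```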

【００５４】[0054]

The logical operators "&", "|", and "¬", on the other hand, are used to state explicitly that they are logical operators of the search expression itself. They are carried over into the keyword search expression Q rather than evaluated as set operations on keyword sets.

【００５５】[0055]

For example, an instruction to connect the key frame 82 with identifier 1 and the key frame 82 with identifier 2 with the logical operator "&" can be expressed by the following search expression (3).

【００５６】[0056]

1 & 2 ... (3)

In this case, the expression corresponds to the search condition "{car or female} and {female or male}", that is, to specifying scenes Si having any of the keyword combinations {car and female}, {female}, {car and male}, or {female and male}.
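The expansion of expression (3) into four keyword combinations can be reproduced by taking one keyword from each operand and collecting the distinct results. This is only a sketch of the reading given in the text, and the function names are hypothetical.

```python
from itertools import product
from typing import FrozenSet, Set


def expand_conjunction(*keyword_sets: Set[str]) -> Set[FrozenSet[str]]:
    """Each operand means "any one of these keywords"; & requires every
    operand to be satisfied. Picking one keyword per operand and removing
    duplicates gives the distinct keyword combinations."""
    return {frozenset(choice) for choice in product(*keyword_sets)}


def scene_satisfies(scene_keywords: Set[str], *keyword_sets: Set[str]) -> bool:
    """A scene satisfies the & condition if it shares at least one
    keyword with every operand."""
    return all(scene_keywords & operand for operand in keyword_sets)


combos = expand_conjunction({"car", "female"}, {"female", "male"})
```

Choosing "female" from both operands collapses to the single-keyword combination {female}, which is why the text lists four combinations rather than a product of pairs only.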

【００５７】[0057]

Similarly, an instruction to connect the key frame 82 with identifier 1 and the key frame 82 with identifier 2 with the logical operator "|" can be expressed by the following search expression (4), and an instruction using the logical operator "¬" can be expressed by search expression (5).

【００５８】[0058]

1 | 2 ... (4)

¬ 3 ... (5)

Search expression (4) therefore corresponds to the search condition "{car or female} or {female or male}", that is, to specifying scenes Si having any of the keywords {car}, {female}, or {male}. Search expression (5) corresponds to the search condition that the keywords do not include {mountain}, that is, to specifying a search for scenes Si that do not include {mountain}.
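Expressions (4) and (5) can be illustrated by filtering a scene list. The per-scene keywords below are hypothetical placeholders, since the patent's tables are not reproduced in this passage. Only the key frame keyword sets come from the text.

```python
from typing import Dict, List, Set

# Hypothetical scene keywords, chosen only for illustration.
SCENE_KEYWORDS: Dict[str, Set[str]] = {
    "S1": {"mountain"},
    "S2": {"car", "female"},
    "S3": {"female", "male"},
    "S4": {"female", "mountain"},
    "S5": {"mountain"},
}

# Keyword sets of key frames 1 to 3 as given in the text.
KEY_FRAME_KEYWORDS: Dict[int, Set[str]] = {
    1: {"car", "female"},
    2: {"female", "male"},
    3: {"mountain"},
}


def scenes_matching_or(left: Set[str], right: Set[str]) -> List[str]:
    """Expression (4): scenes having any keyword of either operand."""
    wanted = left | right
    return [name for name, kws in SCENE_KEYWORDS.items() if kws & wanted]


def scenes_matching_not(operand: Set[str]) -> List[str]:
    """Expression (5): scenes having none of the operand's keywords."""
    return [name for name, kws in SCENE_KEYWORDS.items() if not (kws & operand)]
```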

【００５９】[0059]

The search expressions described above can be entered directly with the input device 50, using the identifiers of the key frames 82 and the logical operators "AND", "OR", "&", "|", and "¬". The entered search expression is displayed in the arbitrary condition input field 72. For example, as in the following search expression (6), a search condition that combines search expressions (1) and (3) can be specified.

【００６０】[0060]

(1 AND 2) & 3 ... (6)

In this case, the parenthesized part denotes the keyword Wj {female} common to the two key frames, and "& 3" adds {mountain}. The expression therefore corresponds to specifying the search condition "scenes Si whose keywords are {female} and also {mountain}".

【００６１】[0061]

In the next step 126, it is determined whether a character string has been entered in the arbitrary condition input field 72. If so, the user has entered a search expression directly, so the entered character string is used as the search expression as is and the process proceeds to step 130. If not, the process proceeds to step 128, where a character string that joins the identifiers of the selected key frames 82 with the logical operators is generated as the search expression, and then proceeds to step 130.
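Steps 126 and 128 amount to choosing between the typed text and a string assembled from the selection. In the sketch below, the function name is hypothetical, and joining all selected identifiers with one operator is a simplifying assumption, since the text does not specify how several operators are combined.

```python
from typing import Sequence


def build_search_expression(typed_text: str,
                            selected_ids: Sequence[int],
                            operator: str) -> str:
    """Step 126: if the arbitrary condition input field holds text, use it as is.
    Step 128: otherwise join the selected key frame identifiers with the
    chosen logical operator (a single operator is assumed here)."""
    if typed_text:
        return typed_text
    return f" {operator} ".join(str(identifier) for identifier in selected_ids)
```

For example, selecting key frames 1 and 2 with the AND button would yield the string of search expression (1), while any typed expression takes priority over the selection.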

【００６２】[0062]

Step 130 generates the keyword search expression Q from the search expression above. The WF correspondence information I4 is consulted, and the identifier of each key frame 82 in the search expression is replaced with the corresponding set of keywords Wj. The search expression is then evaluated, interpreting the logical operator AND as intersection and the logical operator OR as union. For search expression (1), for example, the identifiers 1 and 2 of the key frames 82 are replaced with the keyword sets {car, female} and {female, male} respectively, their intersection is taken, and the keyword search expression becomes Q = {female}.

【００６３】[0063]

For search expression (6), the identifiers 1, 2, and 3 of the key frames 82 are replaced with the keyword sets {car, female}, {female, male}, and {mountain} respectively, and the set operation is applied for AND. The keyword search expression Q becomes:

Q = (1 AND 2) & 3 = ({car, female} AND {female, male}) & {mountain} = {female} & {mountain}
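The substitution and partial evaluation in step 130 can be sketched with a small recursive-descent parser. The sketch makes several assumptions that the text does not state: binary operators are applied left to right with no precedence, AND/OR require operands that are already keyword sets, the input uses ASCII digits and parentheses, and the result is represented as nested tuples. All names are hypothetical.

```python
import re
from typing import Dict, FrozenSet, Set, Tuple, Union

Node = Union[FrozenSet[str], Tuple]

TOKEN = re.compile(r"\d+|AND|OR|[&|¬()]")


def to_keyword_expression(text: str, wf: Dict[int, Set[str]]) -> Node:
    """Replace key frame identifiers with keyword sets, evaluate AND/OR as
    set operations, and keep &, | and ¬ as operators of the result."""
    tokens = TOKEN.findall(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def term() -> Node:
        token = take()
        if token == "¬":
            return ("¬", term())
        if token == "(":
            node = expr()
            if take() != ")":
                raise ValueError("missing closing parenthesis")
            return node
        return frozenset(wf[int(token)])   # identifier -> keyword set

    def expr() -> Node:
        left = term()
        while peek() in ("AND", "OR", "&", "|"):
            op = take()
            right = term()
            if op in ("AND", "OR"):
                if not (isinstance(left, frozenset) and isinstance(right, frozenset)):
                    raise ValueError("AND/OR need keyword-set operands")
                left = left & right if op == "AND" else left | right
            else:
                left = (op, left, right)   # kept for the search processing stage
        return left

    return expr()


WF = {1: {"car", "female"}, 2: {"female", "male"}, 3: {"mountain"}}
q6 = to_keyword_expression("(1 AND 2) & 3", WF)
```

Here `q6` is the tuple form of {female} & {mountain}: the AND has been reduced to a single keyword set, while the & survives for the later search stage.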

【００６４】[0064]

Once the logical operators "AND" and "OR" have been evaluated in this way, the generated keyword search expression Q is output to the search processing device 46, and the routine ends.

【００６５】[0065]

Next, when the keyword search expression Q is input from the search expression generation device 44, the search processing device 46 executes the search processing routine shown in FIG. 6, which is based on the keyword search expression Q. First, in step 132, a signal is output to the scene editor device 24 and the keyword index information I3 stored in the fourth storage device 30 is input. In the next step 134, the keyword index information I3 is consulted to obtain the start frame number and end frame number corresponding to each keyword in the input keyword search expression Q, and the process proceeds to step 136. In step 136, a signal is output to the scene editor device 24 and the scene information I2 stored in the third storage device 28 is input. In the next step 138, it is determined whether the keyword search expression Q contains a logical operator. If so, in step 140 the start frame number and end frame number for which the keyword search expression Q is true are obtained according to the logical operators, and the process proceeds to step 142. If not, in step 141 the start frame number and end frame number are obtained directly from the scene information I2, and the process proceeds to step 142.

【００６６】[0066]

In step 142, the representative frame number and the storage address of the representative frame 80 are obtained by referring to the scene information I2 of each scene Si contained in the frame interval defined by the obtained start and end frame numbers. In step 144, the representative frame 80 is displayed on the display device 48 according to this representative frame number and storage address. In the next step 146, it is determined whether the user has selected a representative frame 80 with the input device 50 and instructed scene playback. If so, in step 148 a command to play the selected scene Si on the TV monitor 18 is output to the LD player 12. If not, the routine ends.

【００６７】[0067]

That is, for search expression (1), the keyword index information I3 in Table 2 above is consulted to obtain the frame interval [100, 399] corresponding to the keyword {female}. The scene information I2 in Table 3 above is then searched, and scenes S2, S3, and S4 are obtained as the search result. The search processing device 46 reads the representative image data of the representative frames of the resulting scenes Si and displays it on the display device 48.
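The lookup from a frame interval to scenes can be sketched as a containment test. The scene boundaries below are hypothetical values, chosen only so that the interval [100, 399] yields S2, S3, and S4 as in the text. The function name is also an assumption.

```python
from typing import Dict, List, Tuple

# Hypothetical scene boundaries (start frame, end frame), inclusive.
SCENES: Dict[str, Tuple[int, int]] = {
    "S1": (0, 99),
    "S2": (100, 199),
    "S3": (200, 299),
    "S4": (300, 399),
    "S5": (400, 499),
}


def scenes_in_interval(scenes: Dict[str, Tuple[int, int]],
                       start: int, end: int) -> List[str]:
    """Return the scenes that lie entirely inside the inclusive
    frame interval [start, end]."""
    return [name for name, (first, last) in scenes.items()
            if start <= first and last <= end]


result = scenes_in_interval(SCENES, 100, 399)  # interval for keyword {female}
```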

【００６８】[0068]

Similarly, for search expression (6), the frame interval [100, 399] corresponding to the keyword {female} is obtained, together with the frame intervals [300, 499] or [0, 99] corresponding to the keyword {mountain}. The interval common to {female} and {mountain} is therefore [300, 399], and scene S4 is obtained as the search result.
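The common interval for {female} & {mountain} can be checked with a pairwise intersection of inclusive frame intervals. The interval values are those given in the text. The function name and the list-of-tuples representation are assumptions.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # inclusive (start frame, end frame)


def intersect(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Intersect two lists of inclusive frame intervals pair by pair,
    keeping only the non-empty overlaps."""
    overlaps = []
    for a_start, a_end in a:
        for b_start, b_end in b:
            start, end = max(a_start, b_start), min(a_end, b_end)
            if start <= end:
                overlaps.append((start, end))
    return sorted(overlaps)


female = [(100, 399)]
mountain = [(300, 499), (0, 99)]
common = intersect(female, mountain)
```

The pair [100, 399] and [0, 99] has no overlap, so only [300, 399] remains, matching the interval that leads to scene S4.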

【００６９】[0069]

As described above, in this embodiment, for each scene obtained by dividing the whole moving image into short segments, one or more representative frames representing that scene are chosen arbitrarily, and a keyword expressing the semantic content of the scene is assigned to each scene. From among the representative frames of the scenes, the user selects as key frames those that, when viewed, readily bring to mind the keywords or images that serve as elements of a scene search. At search time, the user can therefore specify a search condition easily by selecting several key frames and specifying a logical condition. By replacing the selected key frames with their corresponding keywords, a keyword search expression described by logical conditions on keywords can be generated, and the desired scene can be retrieved from the moving image based on that expression.

【００７０】[0070]

Consequently, the keywords used as an index for retrieving a given scene from moving images with widely varying semantic content need not be restricted to standardized terms through complex vocabulary control based on, for example, a thesaurus (concepts described by hierarchical relationships). The indexer can choose keywords at his or her own discretion. This makes the choice of keywords for each scene more flexible and makes the index easier to create.

【００７１】[0071]

In addition, a key frame that determines the search condition is a representative frame in the moving image and also carries image information that visually presents the semantic content of the scene. By selecting key frames directly, the user can specify a search condition intuitively, without having to express the semantic content of the moving image with particular keywords. Even when the indexer and the searcher use different terms, the searcher can set the search condition for the intended scene regardless of the terms the indexer expects, because the key frames that determine the search condition are representative frames in the moving image. Missed results and search errors caused by differences in terminology can thus be reduced.

【００７２】[0072]

Although this embodiment describes a moving image processing system made up of a plurality of devices, the present invention is not limited to this configuration, and the indexing and search processing may be performed by a single moving image processing apparatus.

【００７３】[0073]

In the above embodiment, keywords are assigned to the representative frames and key frames corresponding to each scene, but the present invention does not require this keyword assignment. As long as key frames and scenes are associated at indexing time, for example solely through keyword identification numbers, the desired scene can easily be obtained as a search result by selecting key frames and applying logical conditions at search time, even if no keywords were assigned at indexing time.

【００７４】[0074]

[Effects of the Invention] As described above, according to the moving image processing apparatus of the present invention, the user can specify a search condition at search time by selecting still images contained in the moving image. The search condition can therefore be specified intuitively, without considering the terms that describe the semantic content of the moving image, and missed results caused by differences in terminology between the user who assigns the index and the user who searches are reduced.

【００７５】[0075]

Furthermore, according to the present invention, the keywords used to build a moving image database for moving images with widely varying semantic content need not be restricted to standardized terms through vocabulary control. The indexer alone can choose keywords flexibly, which makes the index of the moving image database easier to create.

[Brief description of drawings]

FIG. 1 is a block diagram showing the schematic configuration of the moving image processing system of this embodiment.

FIG. 2 is a conceptual diagram illustrating the process by which key frames are generated.

FIG. 3 is a flowchart showing the flow of the key frame generation process of this embodiment.

FIG. 4 is a diagram showing the display screen of the display device during search processing by the search device of this embodiment.

FIG. 5 is a flowchart showing the flow of the keyword search expression generation routine.

FIG. 6 is a flowchart showing the flow of the search processing routine.

[Explanation of symbols]

10 Moving image processing system
36 Key frame generation device
42 Search device
44 Search expression generation device
46 Search processing device
80 Representative frame (representative image)
82 Key frame (key image)

Claims

[Claims]

1. A moving image processing apparatus comprising: storage means for storing a moving image composed of a plurality of temporally continuous still images, in which a group of still images having the same or similar semantic content forms one scene, and for storing, as a key image, an image selected from at least one representative image representing each scene; display means for displaying a plurality of the key images; input means for inputting selection information for selecting a key image displayed on the display means and a search condition for the selected key image or for a plurality of selected key images; and search means for searching for a scene based on the selection information and the search condition.

2. A moving image processing apparatus comprising: classifying means for classifying a moving image composed of a large number of temporally continuous still images into a plurality of scenes, each consisting of a group of still images having the same or similar semantic content; first selecting means for selecting, from each classified scene, at least one still image related to the semantic content of that scene; second selecting means for selecting a key image from the selected still images; and storage means for storing the key image.

3. The moving image processing apparatus according to claim 2, wherein the second selecting means selects, as key images, still images such that the union of the semantic content of the selected still images matches the union of the semantic content of all the scenes of the moving image.