JPH07175816A - Video associative retrieving device and method

Info

Publication number: JPH07175816A
Application number: JP6260013A
Authority: JP
Inventors: Akio Nagasaka; 晃朗長坂; Takafumi Miyatake; 孝文宮武; Hirotada Ueda; 博唯上田; Kazuaki Tanaka; 和明田中
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 1993-10-25
Filing date: 1994-10-25
Publication date: 1995-07-14

Abstract

PURPOSE: To let a user reach a desired scene while associatively tracing memory, by determining the state of the video to be played next from pointing information and from video description information that is stored separately.

CONSTITUTION: When the point position detecting part detects that one of the icons displayed in window 1108 has been pointed to, the index control part checks whether the action was a double click. If it was not a double click, video call processing is carried out. If it was a double click, a new window similar to window 1108 is created, and the icons of the lower-order scenes are listed in it by referring to the lower-order scene 1014 included in the scene structure 1000 that corresponds to the pointed scene. Thus, when an icon in a window is pointed to, the index control part either displays the corresponding scene in the monitor window or, if lower-order scenes are found, creates a new window that lists them.

Description

Detailed Description of the Invention

[0001]

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to an apparatus and a method for associatively searching video to find an arbitrary piece of video.

[0002]

2. Description of the Related Art

In recent years, as computers have become faster and gained larger storage capacity, databases have increasingly been built for video information such as movies and videos, which could not previously be handled. Along with this, search technology for efficiently selecting a desired scene from a large amount of stored video is being put into practical use. The method generally used in the video database field is for the user to specify features or keywords of the desired scene and for the computer to find the scenes tagged with matching keywords. However, accurately specifying the features of a scene is very difficult, not only for users unfamiliar with searching but also for experts, and the desired search results are often not obtained.

[0003]

A book, the classical information medium, provides a table of contents and an index as aids to searching. The table of contents lists keywords that each symbolize one unit of the text, in the order in which they occur in the text. The index lists important keywords from the text, arranged in an order that is easy to look up, such as alphabetical (Japanese syllabary) order. The most important feature common to both is that the keywords are displayed as a list. Moreover, they are always found at the beginning or the end of the book, so locating them takes no effort. By using the table of contents or the index, a reader can locate a passage in the text without having to think up keywords. The table of contents also gives an overview of the text, so the reader can quickly judge whether the book is worth reading.

[0004]

Searching by table of contents or index has its own problems: if too many keywords are displayed, the appropriate one is hard to find; if too few are displayed, an appropriate keyword may not exist at all. These problems can be solved by combining the lists with hypertext or full-text search. First, the number of table-of-contents and index entries presented to the user is limited to some extent. From these, the user picks a second-best keyword that seems at least related to the target passage, consults the text it points to, and looks there for a keyword that seems directly related to the passage in mind. If one is found, the user reaches the goal by following it to the intended passage through the hypertext mechanism. This technique is used routinely, for example when searching online manuals. Hypertext requires keywords to be registered in advance, but full-text search allows the same thing to be done for unregistered keywords. A mechanism for following keywords associatively thus widens the range over which the table of contents and the index are useful, and in many cases the target passage can be found simply by choosing among the keywords that appear on screen (hereinafter called associative search).

[0005]

Such a mechanism is considered effective for video search as well. In video, the various things that appear, such as people and objects, can play the role of the keywords above. As an elemental technology for associative search based on them, Japanese Unexamined Patent Publication No. H3-52070, "Related information reference method for moving images," describes a method of referring to related scenes and information from the things shown on the video display screen. By providing means for storing the video section and position at which each thing appears and means for linking it to the corresponding related information, that method lets the user jump easily to a related scene, or call up related information, by pointing with a mouse or the like to a point on the screen where the thing is displayed. As a means of using image processing to reduce the labor of associating each thing with its related information, there is, for example, Japanese Unexamined Patent Publication No. H5-204990 by the present inventors.

[0006]

Problems to Be Solved by the Invention

The prior art cited above is concerned solely with means for associating each thing with its related information; the configuration of the search system as a whole and its usability for the user have not been sufficiently studied. It also has the problem that only things already associated with related information can be followed associatively.

[0007]

An object of the present invention is to provide an interface for video search with which the user can find a desired scene while associatively tracing memory, simply by choosing from the limited information presented by the computer.

[0008]

A second object of the present invention is to provide means for associatively following even things that have not been associated in advance.

[0009]

Means for Solving the Problems

The present invention provides, on a screen, a video display area for displaying arbitrary video, an operation panel area for controlling the playback state of the video, and an area for displaying index information corresponding to a table of contents or index of the video. It has means for detecting which of these display areas has been pointed to, and means for determining the state of the video to be played next from this pointing information and from separately stored description information of the video. It further provides means for identifying the thing being displayed and its position and displaying the related information of that thing superimposed on the video, and means for registering and changing related information. It also provides means for registering and managing the information needed for this processing.

[0010]

In addition, the invention provides means for designating a specific thing appearing in the displayed scene, means for extracting feature quantities of that thing, means for finding other video scenes having feature quantities that match them, and means for jumping immediately to a scene so found.

[0011]

Operation

According to the present invention, when searching for a desired scene, the user can reach the target scene by following, one after another, things that call the desired scene to mind, even if the index information of the video contains nothing directly related to that scene. Organically combining index display with associative search in this way greatly widens the range over which the index is useful, and in many cases the target scene can be found simply by choosing from the information presented by the computer. The user therefore need not think of or specify a suitable keyword or image feature quantity that uniquely determines the target scene; a search is possible even from vague memory, and the system is easy for beginners to understand.

The related-information superimposing means displays some or all of the related information, as selected, of a thing appearing in the video being played, either superimposed at the position of that thing in the video or in a form that makes clear which thing the information belongs to. Information about a thing encountered during an associative search can therefore be learned immediately and accurately, without confusion as to which thing it describes. The related-information registration changing means allows some or all of the related information of a thing appearing in the video being played to be registered or changed on the spot, as soon as the thing appears.

[0012]

Furthermore, even when a thing currently displayed on the screen has not yet been given information for jumping to a related scene, the means for extracting feature quantities of the thing from the display screen and matching them makes it possible to search on the spot for another scene in which the thing appears and to display it.

[0013]

Embodiment

An embodiment of the present invention is described in detail below.

[0014]

FIG. 2 is a schematic block diagram of an example system configuration for implementing the present invention. Reference numeral 1 denotes a display device such as a CRT, which displays the output screen of the computer 4. Reference numeral 12 denotes a speaker for reproducing sound. Commands to the computer 4 can be given with an indirect pointing device 5 such as a mouse, a direct pointing device 13 such as a touch panel, or a keyboard 11. The video playback device 10 is a device for playing back video, such as an optical disc player or a video deck. The video signal output from the video playback device 10 is converted sequentially by the video input device 3 into a format that the computer 4 can handle, and is sent to the computer 4. Inside the computer, the video data enters the memory 9 via the interface 8 and is processed by the CPU 7 according to the program stored in the memory 9. Each frame of the video handled by the video playback device 10 is assigned a number, for example a frame number, in order from the beginning of the video. When the computer 4 sends a frame number to the video playback device 10 over the control line 2, the video of the scene corresponding to that frame number is played back. Video data and various kinds of information can also be stored in the external information storage device 6. In addition to the program, the memory 9 stores various data created by the processing described below, which are referred to as needed.

[0015]

The following first gives an overview of the associative search system and then describes the detailed execution procedure of each technique.

[0016]

FIG. 1 shows an example screen of a system that implements associative video search. Reference numeral 1 denotes the display device, 12 a speaker that outputs voice, background music (BGM), and the like, 5 an indirect pointing device such as a mouse or joystick, 11 a keyboard, and 13 a direct pointing device such as a touch panel.

[0017]

The monitor window 1100 in the display device 1 serves as the monitor screen and has an operation panel 1102 in the same style as a VCR, so that video can be freely played back and viewed. The video shown on the monitor screen corresponds to the "main text" of a "book," and panel (button) operations correspond to "turning pages." The window 1108 at the lower right is a scene list showing a representative image of each scene of the target video, and the window 1112 at the middle right is a list of the subjects that appear in the video. These lists are collectively called the "index."

The scene list in window 1108 is made by selecting a typical frame image from each scene in the video, reducing it, and arranging the results in chronological order as icons 1110. These images can be regarded as the "headings" of the scenes, and the scene list that arranges them in time order corresponds to the "table of contents" of a "book." A subject, on the other hand, is one of the important constituents of a scene and in that sense corresponds to a "keyword" in text. The subject list in window 1112 therefore corresponds to the "index" of a book. When an icon 1110 in the scene list is clicked with the mouse, the video on the monitor screen switches and the scene indicated by that icon is played.

The subject list consists of an icon 1114 indicating what each subject is and a time-axis display (bar graph) 1116 to its right. The time-axis display is a time axis whose left end is the beginning of the video and whose right end is its end; the portions drawn as bars indicate the time sections in which the subject appears. Clicking a bar displays the video of that section on the monitor screen. Reference numeral 1104 denotes a cursor that moves with the movement of a pointing device such as a mouse, and the window 1106 is a general-purpose input/output window that displays various related information about the video.
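As an illustration only, the mapping from a click on the time-axis display 1116 to a playback position might be sketched in Python as follows. The function names, the pixel layout of the bar, and the sample appearance sections are assumptions introduced here; the specification does not define them.

```python
def bar_click_to_frame(click_x, bar_left, bar_width, total_frames):
    """Convert a click x-coordinate on the time-axis bar into a frame number.

    The left end of the bar is frame 0 and the right end is the last frame.
    """
    if bar_width <= 0 or total_frames <= 0:
        raise ValueError("bar_width and total_frames must be positive")
    fraction = (click_x - bar_left) / bar_width
    fraction = min(max(fraction, 0.0), 1.0)  # clamp clicks just outside the bar
    return min(int(fraction * total_frames), total_frames - 1)


def section_at(frame, sections):
    """Return the (first, last) appearance section containing frame, or None."""
    for first, last in sections:
        if first <= frame <= last:
            return (first, last)
    return None


# Hypothetical data: a 1,000-frame video in which the subject appears twice.
sections = [(0, 100), (200, 300)]
frame = bar_click_to_frame(click_x=150, bar_left=100, bar_width=200, total_frames=1000)
clicked = section_at(frame, sections)
start_frame = clicked[0] if clicked else None  # frame to start playback from
```

In this sketch a click that lands between bars returns None, so no playback request would be issued.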

[0018]

Next, the basic idea of associative search according to the present invention is explained with a simple example. Suppose a user wants to find a particular scene in which subject B appears within a series of videos. If the scene list of representative images or the subject list shown in the index happens to contain the target scene (a scene showing subject B) or an icon of subject B itself, the user achieves the goal by clicking it directly and playing it. Usually, however, the amount of video information is enormous, and the target scene often cannot be found easily (for example, if subject B appears only briefly in the video, it is easy to see that the search will not be simple). This is where the associative search of the present invention becomes important. Even when the target scene (subject B) cannot be found directly, the user often has some knowledge about it, and the present invention uses that knowledge to form a link called an association. For example, if the user remembers that subject B and subject A appeared together (that there should have been such a scene), or expects that they are likely to appear together, the user first tries searching for subject A.

[0019]

FIG. 3 illustrates the associative video search function of the present invention. The three pictures in the figure (scenes 1 to 3) each depict one scene of the video displayed on the monitor screen during an associative search. For example, from the index (the subject icons 1114 in window 1112) the user finds one scene showing subject A, which calls the target subject B to mind, and displays it on the monitor screen. While scene 1, at the far left of the figure, is playing on the monitor screen of the monitor window 1100, subjects A and C are shown; when the user clicks subject A with the mouse cursor, the screen switches to scene 2, in the center of the figure, in which subject A also appears. Clicking another subject B that appears together with A in scene 2 brings the user to scene 3, at the right of the figure, in which B appears. If this is the target scene, the associative search ends.

[0020]

In other words, when looking for a particular scene in which subject B appears, the user can rely on the association that subject B appears together with subject A and, through subject A registered in the index, associatively reach the target scene showing subject B. No troublesome operation such as thinking up keywords is needed; the user only has to look at the information that appears on the screen and choose from it.
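The walk from scene 1 through subject A to subject B can be mimicked with a toy model. The following Python sketch is only an illustration under stated assumptions: each scene is reduced to the set of subjects visible in it, and a click on a subject jumps to the next scene (in cyclic order) that shows the same subject. The scene contents and the function name jump are hypothetical.

```python
# Hypothetical video: each scene lists the subjects visible in it.
scenes = [
    {"A", "C"},  # scene 1 (index 0)
    {"A", "B"},  # scene 2 (index 1)
    {"B"},       # scene 3 (index 2)
]


def jump(current, subject, scenes):
    """Return the index of the next scene, in cyclic order, that shows subject.

    Returns None when no scene shows the subject at all.
    """
    n = len(scenes)
    for step in range(1, n + 1):
        candidate = (current + step) % n
        if subject in scenes[candidate]:
            return candidate
    return None


# Start in scene 1, click subject A, then click subject B.
path = [0]
path.append(jump(path[-1], "A", scenes))
path.append(jump(path[-1], "B", scenes))
```

Here path records the scenes visited; the user never types a keyword and only chooses among the subjects visible on screen.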

[0021]

As described later, the search is not limited to associations between subjects; it can use associations based on any multimedia information in the video, such as the scene itself, spoken words, BGM, and subtitles.

[0022]

The information needed to implement this associative search function is basically of three kinds: (1) the video section in which a subject appears (appearance section), (2) the position of the subject on the screen (appearance position), and (3) the other video section to switch to when the subject is clicked (link information). These three pieces of information are handled as a set.

[0023]

Which subject was clicked during video playback is determined from the appearance section and appearance position information (1) and (2), and the destination of the video switch is determined from the link information (3) stored in the same set. Video is realized by displaying still images called frames continuously at a rate of 30 per second. If these frames are assigned consecutive numbers, called frame numbers, in order from the beginning of the video, the appearance section (1) can be expressed by the frame numbers of the first and last frames of the section. The appearance position (2) is coordinate information indicating in which region of each frame in section (1) the subject is shown. As the link information (3), links are set so that the other scenes in which the same subject appears can be visited one after another. The same subject often appears many times in a single video, and with these links every scene in which the subject appears can be called up simply by clicking.
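A minimal Python sketch of how the three pieces of information might be held as one set and used to resolve a click is given below. It is not the data layout of the specification: the class name Appearance, the use of a single rectangle per section (the text describes per-frame coordinate information), and the sample values are all assumptions made for illustration. Only the rate of 30 frames per second comes from the text.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

FPS = 30  # frame rate stated in the text


@dataclass
class Appearance:
    subject: str
    first_frame: int                    # (1) appearance section: first frame
    last_frame: int                     # (1) appearance section: last frame
    region: Tuple[int, int, int, int]   # (2) x, y, width, height (simplified)
    link: Optional[int] = None          # (3) index of the appearance to jump to


def find_clicked(appearances, frame, x, y):
    """Return the index of the appearance hit by a click at (x, y), or None."""
    for i, a in enumerate(appearances):
        rx, ry, rw, rh = a.region
        in_section = a.first_frame <= frame <= a.last_frame
        in_region = rx <= x < rx + rw and ry <= y < ry + rh
        if in_section and in_region:
            return i
    return None


def frame_to_seconds(frame):
    """Convert a frame number to elapsed seconds at 30 frames per second."""
    return frame / FPS


# Hypothetical data: subject A appears twice, each appearance linked to the other.
appearances = [
    Appearance("A", 0, 299, (10, 10, 100, 100), link=1),
    Appearance("A", 900, 1199, (200, 50, 80, 120), link=0),
]
hit = find_clicked(appearances, frame=150, x=50, y=60)
target = appearances[appearances[hit].link]
```

Clicking subject A during its first appearance resolves to the second appearance, which starts at frame 900, that is, 30 seconds into the video.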

[0024]

The associative search method configured as above can be used only for subjects for which link information has already been set. However, of the three kinds of information needed for associative search, the appearance section and appearance position of a subject can be obtained by an automatic subject search technique such as that of Japanese Patent Application No. H4-261033 by the present inventors.

[0025]

FIG. 4 outlines the automatic subject search algorithm. The basic idea is to find, within a frame, the combination of colors characteristic of the subject being sought. First, the user selects from the video just one frame in which the subject appears as an example image, and the characteristic colors are extracted from that image. The system then divides every frame of the video, one by one, into small blocks and looks for blocks containing the characteristic colors. If a frame contains at least a certain number of blocks for each characteristic color, the subject is judged to be present in that frame. The appearance position of the subject in the frame is easily obtained during this search by examining where in the frame the blocks containing the characteristic colors of the subject are distributed.
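The block-based color test described above might be sketched as follows. This is a simplified illustration, not the algorithm of the cited application: a frame is assumed to be a two-dimensional list of already quantized color labels, and the block size, the threshold min_blocks, and all function names are hypothetical.

```python
def block_colors(frame, block):
    """Split a frame (2-D list of color labels) into square blocks.

    Returns a dict mapping (block_row, block_col) to the set of colors in it.
    """
    result = {}
    for top in range(0, len(frame), block):
        for left in range(0, len(frame[0]), block):
            colors = set()
            for row in frame[top:top + block]:
                colors.update(row[left:left + block])
            result[(top // block, left // block)] = colors
    return result


def detect(frame, feature_colors, block=2, min_blocks=1):
    """Judge whether the subject is present in the frame.

    The subject is present when every characteristic color occurs in at least
    min_blocks blocks. Returns the sorted positions of the blocks containing a
    characteristic color (a rough appearance position), or None if absent.
    """
    blocks = block_colors(frame, block)
    hits = set()
    for color in feature_colors:
        matching = {pos for pos, colors in blocks.items() if color in colors}
        if len(matching) < min_blocks:
            return None
        hits |= matching
    return sorted(hits)


# Hypothetical 4x4 frame: "k" is background; the subject is red and yellow.
frame = [
    ["k", "k", "r", "r"],
    ["k", "k", "r", "y"],
    ["k", "k", "k", "k"],
    ["k", "k", "k", "k"],
]
position = detect(frame, {"r", "y"})
absent = detect(frame, {"r", "g"})
```

The returned block positions indicate where in the frame the subject lies, which corresponds to the appearance position mentioned in the text.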

[0026]

This subject search method by itself, however, presupposes that an example image is presented to the system, so at least one section in which the desired subject appears must be found by hand, which is often tedious. In an associative search such as that of the present invention, by contrast, the subject on the monitor screen can be used immediately as the example image, so the method can be used very effectively.

[0027]

Furthermore, if every frame of the video is divided into blocks in advance and a list of the kinds of colors contained in each block is stored in a storage device, the subject search no longer needs to divide each frame into blocks and becomes very fast. Even on a machine with the performance of a current workstation, the search can run at 100 times real time, so all appearance sections of a subject can be found in a one-hour video in about 30 seconds. If only the single appearance section nearest to the currently displayed video is needed, it can be found in a few seconds on average. Naturally, the same stored color lists can be used regardless of the subject being searched for.
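The figures quoted above are mutually consistent: one hour is 3,600 seconds, and at 100 times real time the scan takes 3,600 / 100 = 36 seconds, in line with "about 30 seconds." The Python sketch below illustrates, under stated assumptions, how a stored per-block color list could be scanned to produce appearance sections. The index layout (one list of color sets per frame) and the function names are hypothetical.

```python
SPEEDUP = 100  # "100 times real time", from the text


def search_seconds(video_seconds, speedup=SPEEDUP):
    """Estimated time to scan a video of the given length."""
    return video_seconds / speedup


def appearance_sections(index, feature_colors, min_blocks=1):
    """Scan a precomputed index and return (first, last) frame sections.

    index[f] is the stored list of per-block color sets for frame f, so no
    block division is needed at search time.
    """
    sections, start = [], None
    for f, blocks in enumerate(index):
        present = all(
            sum(color in b for b in blocks) >= min_blocks
            for color in feature_colors
        )
        if present and start is None:
            start = f                        # a section opens
        elif not present and start is not None:
            sections.append((start, f - 1))  # the section closed one frame ago
            start = None
    if start is not None:
        sections.append((start, len(index) - 1))
    return sections


# Hypothetical index for a 4-frame video with 2 blocks per frame.
index = [
    [{"k"}, {"r", "y"}],  # frame 0
    [{"k"}, {"r", "y"}],  # frame 1
    [{"k"}, {"k"}],       # frame 2
    [{"r"}, {"y"}],       # frame 3
]
found = appearance_sections(index, {"r", "y"})
one_hour = search_seconds(3600)
```

Because the index does not depend on the subject, the same structure serves a search for any other set of characteristic colors.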

[0028]

The execution procedure of a system implementing the present invention is described below using block diagrams of the software modules executed by the CPU 7 according to the program stored in the memory 9. Each module described here can also be implemented in hardware.

[0029]

FIG. 5 is an example of a processing block diagram for implementing associative scene search according to the present invention. Information on when and where things such as subjects, which serve as cues for associative search, appear in the video (appearance section and appearance position), together with their related information and the scenes to jump to (link information), is assumed to be stored in advance in the memory 9 of FIG. 2 or in the external information storage device 6, in the form of data structures called objects, described later.

[0030]

Information about things in the video, such as subjects, is managed in object-oriented data structures, one of which is created for each appearance section. FIG. 6 is an explanatory diagram of this concept. Such a structure is hereinafter called a video object, or simply an object. Video is divided into a moving-image part and an audio part. The moving image as a whole can be represented as a three-dimensional space formed by the xy plane of the frame image and the time axis t, and the appearance section and appearance position of a subject can be regarded as a subspace of it. A video object is defined as a data structure in one-to-one correspondence with this subspace. (That is, even for the same subject, a separate video object is in principle defined for each appearance section, and links are set between those video objects.) Besides subjects, a video object can be associated with various other information in the video, such as subtitles and scenes. For audio information such as spoken words and BGM, a video object can likewise be defined as a data structure in one-to-one correspondence with an arbitrary section of an audio information space that has a time axis. Link information is stored as pointers by which video objects refer to one another. Because a common data-structure framework is used even when the underlying media differ, as between moving image and audio, links can be set freely between any pieces of information in the video.
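One way to picture such video objects is the Python sketch below. It is an assumption-laden illustration rather than the structure shown in FIG. 6: the field names, the choice of a single rectangle for the image region, and the helper link_ring, which chains the appearances of one subject into a ring in time order, are all introduced here.

```python
from dataclasses import dataclass, field
from typing import Optional, Tuple


@dataclass(eq=False)  # compare by identity; objects may point at each other
class VideoObject:
    label: str
    medium: str                                        # "image" or "audio"
    first_frame: int                                   # start on the time axis
    last_frame: int                                    # end on the time axis
    region: Optional[Tuple[int, int, int, int]] = None  # None for audio
    next: Optional["VideoObject"] = field(default=None, repr=False)  # link


def link_ring(objects):
    """Link the objects into a ring in time order and return them sorted."""
    ordered = sorted(objects, key=lambda o: o.first_frame)
    for cur, nxt in zip(ordered, ordered[1:] + ordered[:1]):
        cur.next = nxt
    return ordered


# Hypothetical objects: two appearances of subject A and one stretch of BGM.
a1 = VideoObject("A", "image", 0, 299, (10, 10, 100, 100))
a2 = VideoObject("A", "image", 900, 1199, (200, 50, 80, 120))
bgm = VideoObject("theme", "audio", 850, 1300)

link_ring([a2, a1])  # a1 -> a2 -> a1
bgm.next = a2        # a link across media: audio object to image object
```

Because image and audio objects share one structure, the pointer in next can cross media, as the link from the BGM object to the second appearance of subject A shows.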

【００３１】さて，図５に戻って，処理ブロック図を詳
細に説明する。オブジェクト管理部１２０は，これらオ
ブジェクトを管理するモジュールであり，オブジェクト
の登録・変更・削除の処理を行うとともに，他のモジュ
ールから要求があれば，示された条件に合致するオブジ
ェクトの情報１２２を取り出し，そのモジュールに提示
する。１００の映像再生表示部は，図１のディスプレイ
装置１のモニタ画面であるモニタウインドウ１１００
に，映像の再生及び表示処理を行うとともに，現在表示
している映像の再生位置情報２１８を１０２のポイント
事物識別部に送る。ポイント位置検出部１０４は，図１
のマウス等の間接的なポインティングデバイス５やタッ
チパネルのような直接的なポインティングデバイス１３
を常時監視し，ユーザがポイントの動作を行った表示画
面上の位置情報１１２をポイント事物識別部１０２に送
る。また，併せてその位置情報１１２はインデクス管理
部１０８，操作パネル部１１０にも送られる。ポイント
事物識別部１０２は，映像再生表示部１００から受け取
った再生位置情報２１８をオブジェクト管理部１２０に
送り，その再生位置に出現しているとして登録されてい
る全ての事物の情報をオブジェクトとして得る。もし該
当するオブジェクトがあれば，さらにそれらから事物の
位置情報を取得して，ポイント位置検出部１０４からの
位置情報１１２との照合を行い，どの事物がポイントさ
れたのかを識別する。ポイント事物識別部１０２は，識
別された事物に関する情報１１４を映像制御部１０６に
送る。映像制御部１０６は，識別された事物に関する情
報１１４の中でリンク情報に基づき，その事物が現れて
いる別のシーンにジャンプする等の処理を行うため，映
像再生表示部１００に制御情報２０８を送る。また，後
述するように，事物の関連情報を表示する際には制御情
報２１０を映像再生表示部１００に送る。１０８のイン
デクス管理部は，登録されている映像の代表的なフレー
ム画像をアイコン１１１０として記憶するとともに，そ
れらのアイコンの一覧にしてウインドウ１１０８に表示
する。インデクス管理部１０８はアイコン１１１０と一
緒にそのフレーム番号も記憶しており，ポイント位置検
出部１０４が，あるアイコンをポイントしていることを
検出すると，そのアイコンに対応するシーンを再生する
ように，制御情報１１６を映像制御部１０６に伝える。
また，ポイント事物識別部１０２から，どの事物がポイ
ントされたかの情報１２４をもらい，その事物がポイン
トされたことがインデクスからもわかるような表示を行
う。また，インデクス管理部１０８は，図１のウインド
ウ１１１２の被写体の一覧表示も管理する。つまり，被
写体が何であるかを示すアイコン１１１４を表示すると
共に，その被写体の時間軸表示（棒グラフ表示）を行な
い，棒の部分がクリックされるとその区間の映像を再生
するように制御情報１１６を映像制御部１０６に送る。
操作パネル部１１０は，再生・早送り・巻戻し等の各種
再生状態を表す図１の操作パネル１１０２を表示し，ポ
イント位置検出部１０４によって，その操作パネルがポ
イントされていることが検出されると，ポイントされた
操作パネルに対応する再生状態にするよう制御情報１１
８を映像制御部１０６に送る。Returning now to FIG. 5, the processing block diagram is described in detail. The object management unit 120 is the module that manages these objects. It registers, changes, and deletes objects and, on request from another module, retrieves the information 122 of the objects that match the specified conditions and presents it to that module. The video reproduction display unit 100 reproduces and displays video in the monitor window 1100, which is the monitor screen on the display device 1 of FIG. 1, and sends the reproduction position information 218 of the currently displayed video to the point thing identifying unit 102. The point position detection unit 104 constantly monitors the indirect pointing device 5, such as a mouse, and the direct pointing device 13, such as a touch panel, both shown in FIG. 1, and sends the position information 112 of the location on the display screen where the user performed a pointing action to the point thing identifying unit 102. The position information 112 is also sent to the index management unit 108 and the operation panel unit 110. The point thing identifying unit 102 sends the reproduction position information 218 received from the video reproduction display unit 100 to the object management unit 120 and obtains, as objects, the information of all things registered as appearing at that reproduction position. If there are such objects, it further acquires the position information of the things from them, collates it with the position information 112 from the point position detection unit 104, and identifies which thing was pointed at. The point thing identifying unit 102 sends information 114 about the identified thing to the video control unit 106. Based on the link information in the information 114, the video control unit 106 sends control information 208 to the video reproduction display unit 100 in order to perform processing such as jumping to another scene in which the thing appears. As described later, when related information of a thing is to be displayed, it sends control information 210 to the video reproduction display unit 100. The index management unit 108 stores representative frame images of the registered video as icons 1110 and displays a list of those icons in the window 1108. The index management unit 108 stores the frame number together with each icon 1110; when the point position detection unit 104 detects that an icon has been pointed at, the index management unit sends control information 116 to the video control unit 106 so that the scene corresponding to that icon is reproduced. It also receives from the point thing identifying unit 102 the information 124 indicating which thing was pointed at, and updates the display so that the index also shows that the thing was pointed at. The index management unit 108 also manages the list display of subjects in the window 1112 of FIG. 1. That is, it displays an icon 1114 indicating what the subject is, together with a time-axis display (bar graph) for that subject, and when a bar segment is clicked it sends control information 116 to the video control unit 106 so that the video of that section is reproduced. The operation panel unit 110 displays the operation panel 1102 of FIG. 1, which represents reproduction states such as play, fast forward, and rewind. When the point position detection unit 104 detects that the operation panel has been pointed at, the operation panel unit sends control information 118 to the video control unit 106 to switch to the reproduction state corresponding to the pointed control.

【００３２】図７は，映像再生表示部１００をより詳し
く示した処理ブロック図の一例である。２００の映像再
生部は，映像制御部１０６から送られる制御情報２０８
によって，どの映像をどこからどのように再生するか等
の指示を受けて映像を再生する。現在表示されている映
像の再生位置情報２１２は，逐次事物有無判定部２０２
に送られる。２０２は，その再生位置の映像に，あらか
じめ登録されている事物が登場しているかどうかをチェ
ックし，あれば，登場している全ての事物の表示画面上
での位置情報２１６を取得して，関連情報表示部２０４
に送る。この位置情報２１６は，先述のポイント事物識
別部１０２で取得する位置情報と同じものであり，位置
情報取得処理の重複を避けるため，事物情報２１８とし
てポイント事物識別部１０２に送ることができる。関連
情報表示部２０４は，再生中の各事物の関連情報を画面
上に合わせて表示することができる。制御情報２１０に
より，関連情報を表示するのか否か，表示するのならど
の関連情報をどのような形態で表示するのか等が決定さ
れる。特に，事物の位置情報２１６によって，表示中の
どの位置と対応する情報なのかを明示することができ
る。この表示方法については後述する。表示方法によっ
ては，映像２１４に重畳合成処理を行い，その映像を映
像表示部２０６で表示する。FIG. 7 is an example of a processing block diagram showing the video reproduction display unit 100 in more detail. The video reproducing unit 200 reproduces video according to the control information 208 sent from the video control unit 106, which specifies which video to reproduce, from where, and how. The reproduction position information 212 of the currently displayed video is sent successively to the thing presence/absence determination unit 202. The unit 202 checks whether any pre-registered thing appears in the video at that reproduction position and, if so, acquires the position information 216 on the display screen of every thing that appears and sends it to the related information display unit 204. This position information 216 is the same as the position information acquired by the point thing identifying unit 102 described above, so to avoid duplicating the position acquisition process it can be sent to the point thing identifying unit 102 as thing information 218. The related information display unit 204 can display the related information of each thing being reproduced on the screen along with the video. The control information 210 determines whether related information is displayed and, if so, which related information is displayed and in what form. In particular, the position information 216 of a thing makes it possible to indicate clearly which position in the displayed video the information corresponds to. This display method is described later. Depending on the display method, the related information is superimposed on the video 214, and the resulting video is displayed by the video display unit 206.

【００３３】図８は，映像オブジェクトのデータ構造を
示す一例である。５００はデータ構造体の大枠である。
５０２は，オブジェクトのＩＤ番号で，他のオブジェク
トと識別するための一意な数が与えられる。５０４は，
オブジェクトが，例えば，人を表すのか，字幕であるの
か，あるいは音声なのかを示す分類コードである。５０
６は，そのオブジェクトが登場する映像へのポインタで
ある。この例では，後述するように，映像は物理映像６
００と論理映像９００の２階層に分けたデータ構造にな
っており，５０６は，物理映像へのポインタである。５
１０は，オブジェクトが表す映像中の事物が登場する区
間の始点のフレーム番号，５１２は終点のフレーム番号
である。５０８は，その事物を代表する映像のフレーム
番号であり，オブジェクトを視覚的に扱うインタフェー
スの下においては，この事物を表すアイコンの絵柄とし
て用いられる。５１４は，オブジェクトの表す事物の画
面上での位置を示すための構造体７００へのポインタで
ある。FIG. 8 shows an example of the data structure of a video object. Reference numeral 500 denotes the overall data structure. 502 is the ID number of the object, a unique number that distinguishes it from other objects. 504 is a classification code indicating whether the object represents, for example, a person, a subtitle, or audio. 506 is a pointer to the video in which the object appears. In this example, as described later, video is held in a two-layer data structure consisting of the physical video 600 and the logical video 900, and 506 is a pointer to the physical video. 510 is the frame number of the start point of the section in which the thing represented by the object appears, and 512 is the frame number of the end point. 508 is the frame number of the image that represents the thing; under an interface that handles objects visually, it is used as the picture of the icon representing the thing. 514 is a pointer to the structure 700 that indicates the position on the screen of the thing represented by the object.
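
The fields 502 to 514 of the video object structure 500 can be pictured with a minimal Python sketch. The class name, the field names, and the use of plain integers and strings in place of pointers and classification codes are assumptions made for illustration; the patent specifies only the reference numerals and their meanings.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VideoObject:
    """Illustrative stand-in for data structure 500 (names are hypothetical)."""
    object_id: int                    # 502: unique ID number
    category: str                     # 504: classification code, e.g. "person", "subtitle", "audio"
    physical_video_id: int            # 506: reference to the physical video (an ID stands in for the pointer)
    representative_frame: int         # 508: frame used as the icon picture
    start_frame: int                  # 510: first frame of the appearance section
    end_frame: int                    # 512: last frame of the appearance section
    position: Optional[object] = None  # 514: head of the position list (structure 700)

    def appears_at(self, frame: int) -> bool:
        """True if the thing is registered as appearing at the given frame."""
        return self.start_frame <= frame <= self.end_frame


taro = VideoObject(object_id=1, category="person", physical_video_id=10,
                   representative_frame=120, start_frame=100, end_frame=200)
```

A check such as `taro.appears_at(150)` corresponds to the test of whether an object is on screen at the current reproduction position.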

【００３４】図９に，オブジェクト位置構造体７００の
一例を示す。この構造体は，事物の動きがないか，ある
いは十分小さい区間ごとに１つずつ作成され，それらが
数珠つなぎになった連接リストの形をとる。７０２は，
その動きのない区間の始点フレーム番号，７０４が終点
フレーム番号である。７０６から７１２は，事物を矩形
領域で囲んだときの矩形領域の原点座標と大きさであ
る。５１６は，より抽象度の高い上位のオブジェクトへ
のポインタである。全てのオブジェクトは固有の関連情
報を持つことができるが，幾つかのオブジェクトで関連
情報を共有したほうが都合のいい場合がある。例えば，
映像中の人や物などの被写体は，同じ被写体が複数のシ
ーンで現れることが多い。もちろん，現れたときの姿や
挙動は各シーンごとに違うため，各シーンごとに固有の
関連情報が存在するが，名前であるとか，性別・年齢・
職業といった抽象度の高い情報は共有したほうがデータ
量が少なくて済み，また，情報が更新されたときにも整
合性に破綻をきたすことがない。その意味で，こうした
抽象度の高い情報は，より上位のオブジェクトの関連情
報にもたせ，そのオブジェクトへのポインタを５１６に
持つデータ構造としている。５１８は，上位のオブジェ
クトから下位のオブジェクトを参照するためのポインタ
である。これは，上位のオブジェクトも下位のオブジェ
クトも同じ５００のデータ構造体を用いるためによる。
もちろん，上位のオブジェクトには，始点・終点フレー
ムや位置情報等の映像に直接関係する情報は不要である
ので，それらを省いた簡略版の構造体を用いることもで
きる。FIG. 9 shows an example of the object position structure 700. One such structure is created for each section in which the thing does not move or moves only slightly, and the structures are chained together as a linked list. 702 is the start frame number of the section without movement, and 704 is its end frame number. 706 to 712 are the origin coordinates and the size of the rectangular region that encloses the thing. 516 is a pointer to a higher-level object with a higher degree of abstraction. Every object can have its own related information, but it is sometimes more convenient for several objects to share related information. For example, the same subject, such as a person or an object, often appears in several scenes of a video. Its appearance and behavior differ from scene to scene, so each scene has its own related information, but highly abstract information such as name, sex, age, and occupation is better shared: the amount of data is smaller, and consistency is not broken when the information is updated. For this reason, such highly abstract information is held in the related information of a higher-level object, and the data structure holds a pointer to that object in 516. 518 is a pointer used by a higher-level object to refer to its lower-level objects. It is present because higher-level and lower-level objects use the same data structure 500. Of course, a higher-level object does not need information directly tied to the video, such as start and end frames or position information, so a simplified structure that omits them can also be used.
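
The linked list of position structures 700 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the names `PositionNode` and `rect_at` are hypothetical, and the assignment of 706 to 712 to x, y, width, and height in that order is an assumption (the text says only that they hold the origin coordinates and size of the rectangle).

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class PositionNode:
    """One section with little or no movement (structure 700)."""
    start_frame: int                       # 702
    end_frame: int                         # 704
    x: int                                 # 706-712: rectangle origin and size
    y: int
    width: int
    height: int
    next: Optional["PositionNode"] = None  # next section in the linked list


def rect_at(head: Optional[PositionNode], frame: int) -> Optional[Tuple[int, int, int, int]]:
    """Walk the list and return the rectangle valid at `frame`, or None."""
    node = head
    while node is not None:
        if node.start_frame <= frame <= node.end_frame:
            return (node.x, node.y, node.width, node.height)
        node = node.next
    return None


second = PositionNode(31, 60, 50, 40, 20, 30)
first = PositionNode(0, 30, 10, 10, 20, 30, next=second)
```

Here `rect_at(first, 45)` walks past the first section and returns the rectangle of the second one.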

【００３５】５２０は，事物の関連情報を記憶するディ
クショナリ８００へのポインタである。ディクショナリ
は，図１０に示されるように，関連情報を呼び出すため
のキーとなる文字列８０４へのポインタであるキー８０
２と，そのキー文字列に対応づけて登録する関連情報の
文字列８０８へのポインタである内容８０６，及び関連
するオブジェクトへのポインタを持つリンク８１０から
構成され，登録する関連情報の項目数だけ作られ，それ
らが数珠つなぎになった連接リスト形式をとる。オブジ
ェクトの関連情報の読み出しは，キーを指定して，その
キーと合致するディクショナリ構造体の内容を返すこと
で行う。例えば，キーが「名前」で内容が「太郎」の場
合には，「名前」というキーを指定すると「太郎」とい
う関連情報が得られる。関連情報表示部２０４では，ど
の関連情報を表示するかの選択は，どのキーに対応する
内容を表示するかという処理に帰着する。リンクは，連
想検索を行うときのジャンプ先の事物へのポインタであ
り，内容８０６には，例えば，「他のシーンに現れてい
る同じ被写体」といったリンクの意味を表す文字列ある
いは記号が入り，リンク先８１０には，その被写体のオ
ブジェクトへのポインタが入る。連想検索でジャンプす
るときには，映像制御部１０６は，このオブジェクトの
構造体から，その被写体が現れている映像と先頭フレー
ム番号を読み出して，その映像位置から再生するように
映像再生部２００を制御する。520 is a pointer to the dictionary 800 that stores the related information of the thing. As shown in FIG. 10, a dictionary entry consists of a key 802, which is a pointer to the character string 804 that serves as the key for retrieving related information; a content 806, which is a pointer to the character string 808 of the related information registered under that key string; and a link 810, which holds a pointer to a related object. One entry is created for each item of related information to be registered, and the entries are chained together as a linked list. The related information of an object is read by specifying a key and returning the content of the dictionary structure whose key matches. For example, if the key is “name” and the content is “Taro”, specifying the key “name” yields the related information “Taro”. In the related information display unit 204, selecting which related information to display therefore reduces to choosing which key's content to display. A link is a pointer to the thing that is the jump destination in an associative search. The content 806 holds a character string or symbol that expresses the meaning of the link, for example “the same subject appearing in another scene”, and the link destination 810 holds a pointer to the object of that subject. When jumping in an associative search, the video control unit 106 reads from this object's structure the video in which the subject appears and its first frame number, and controls the video reproducing unit 200 so that reproduction starts from that position.
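
The key lookup over the dictionary 800 can be sketched as a walk over a linked list of entries. The names `DictEntry` and `lookup`, and the use of a plain string in place of the linked object, are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DictEntry:
    """One item of related information (dictionary structure 800)."""
    key: str                            # 802 -> key string 804
    content: str                        # 806 -> content string 808
    link: Optional[object] = None       # 810: related object, if any
    next: Optional["DictEntry"] = None  # next entry in the linked list


def lookup(head: Optional[DictEntry], key: str) -> Optional[DictEntry]:
    """Return the first entry whose key matches, or None if there is none."""
    entry = head
    while entry is not None:
        if entry.key == key:
            return entry
        entry = entry.next
    return None


jump = DictEntry("jump", "the same subject appearing in another scene", link="object-42")
head = DictEntry("name", "Taro", next=jump)
```

With this list, `lookup(head, "name").content` yields the string registered under the key, and `lookup(head, "jump").link` yields the jump destination.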

【００３６】図１１は，映像再生部２００のより詳しい
処理ブロック図である。映像は，論理映像と物理映像の
２階層構造になっている。論理映像はシーンの集合体と
しての構造情報だけを持ち，物理映像は映像の実データ
を持つ。論理映像呼出部３００は，映像制御部から送ら
れる再生位置設定情報３１０から，論理映像のライブラ
リ３０４から合致する論理映像を呼び出す。FIG. 11 is a more detailed processing block diagram of the video reproducing unit 200. The video has a two-layer structure of a logical video and a physical video. A logical video has only structural information as a set of scenes, and a physical video has actual data of the video. The logical video calling unit 300 calls the matching logical video from the logical video library 304 based on the reproduction position setting information 310 sent from the video control unit.

【００３７】図１２に論理映像のデータ構造体９００の
一例を示す。９０２は，論理映像を一意に特定するＩＤ
番号である。９０４は，論理映像を代表するシーンの番
号である。９０６は，構成シーンを表す連接リストで，
シーン１０００が再生されるべき順番に連なっている。
９０８は，シーン間のデゾルブやワイプといった特殊効
果の設定情報である。９１０には，各種関連情報が入
る。FIG. 12 shows an example of the logical video data structure 900. 902 is an ID number that uniquely identifies the logical video. 904 is the number of the scene that represents the logical video. 906 is a linked list of the constituent scenes, in which the scenes 1000 are chained in the order in which they are to be reproduced. 908 is the setting information for special effects between scenes, such as dissolves and wipes. 910 holds various related information.

【００３８】図１３に，シーン構造体１０００の一例を
示す。１００２がシーンの代表フレーム番号で，１００
４が始点，１００６が終点のフレーム番号である。対応
する物理映像へのポインタが１００８に入る。１０１０
には，このシーンの中に登場する全ての事物のデータ構
造体，すなわちオブジェクトへのポインタが連接リスト
形式で入る。シーンは，その映像内容のつながりを単位
にまとめることができ，ピラミッド状の階層的な管理を
行うことができる。１０１２の上位シーンは，そうした
上位のシーンへのポインタであり，１０１４の下位シー
ンは，１段下位にある全てのシーンを連接リストにした
ものへのポインタである。１０１６はシーンの属性情報
である。物理映像呼出部３０２は，フレーム番号に３０
０でシーン情報が加わった情報３１２によって，物理映
像のライブラリ３０８から呼び出す物理映像と再生する
フレーム位置を決定する。FIG. 13 shows an example of the scene structure 1000. 1002 is the representative frame number of the scene, 1004 is the frame number of its start point, and 1006 is the frame number of its end point. 1008 holds a pointer to the corresponding physical video. 1010 holds, as a linked list, pointers to the data structures of all things that appear in this scene, that is, to their objects. Scenes can be grouped according to the continuity of their video content and managed hierarchically in a pyramid. The upper scene 1012 is a pointer to such a higher-level scene, and the lower scene 1014 is a pointer to a linked list of all scenes one level below. 1016 is the attribute information of the scene. The physical video calling unit 302 determines which physical video to call from the physical video library 308 and which frame position to reproduce, on the basis of the information 312, in which the logical video calling unit 300 has added scene information to the frame number.
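
The pyramid-shaped scene hierarchy can be sketched with the fields of scene structure 1000. The class and function names are hypothetical, Python lists stand in for the linked lists, and an integer ID stands in for the pointer to the physical video.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Scene:
    """Illustrative stand-in for scene structure 1000."""
    representative_frame: int                               # 1002
    start_frame: int                                        # 1004
    end_frame: int                                          # 1006
    physical_video_id: int                                  # 1008
    objects: list = field(default_factory=list)             # 1010: things appearing in the scene
    parent: Optional["Scene"] = None                        # 1012: upper scene
    children: List["Scene"] = field(default_factory=list)   # 1014: scenes one level below
    attributes: dict = field(default_factory=dict)          # 1016


def add_lower_scene(upper: Scene, lower: Scene) -> None:
    """Register `lower` one level below `upper`, keeping both links consistent."""
    lower.parent = upper
    upper.children.append(lower)


def leaf_scenes(scene: Scene) -> List[Scene]:
    """Return the lowest-level scenes under `scene`, in playback order."""
    if not scene.children:
        return [scene]
    leaves: List[Scene] = []
    for child in scene.children:
        leaves.extend(leaf_scenes(child))
    return leaves


root = Scene(0, 0, 299, 1)
part = Scene(100, 100, 299, 1)
add_lower_scene(root, Scene(0, 0, 99, 1))
add_lower_scene(root, part)
add_lower_scene(part, Scene(100, 100, 199, 1))
add_lower_scene(part, Scene(200, 200, 299, 1))
```

Calling `leaf_scenes(root)` flattens the pyramid back into the three lowest-level scenes.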

【００３９】図１４は，物理映像構造体６００の一例で
ある。６０２は，物理映像を一意に特定するＩＤ番号で
ある。６０４は，レーザーディスクの映像なのか，ビデ
オテープのものか，あるいは外部情報記憶装置に格納さ
れたデータなのかを識別するための分類コードである。
６０６は代表フレーム番号，６０８が始点，６１０が終
点フレーム番号である。６１６には属性情報が入る。他
は，映像データが物理映像のデータ構造体の中に持って
いる場合に必要となる情報である。６１２が映像の画面
幅，６１４が同高さであり，６１８は，あるフレーム番
号に対応するフレーム画像データが，物理映像のどのア
ドレスから存在するかを記憶したディレクトリである。
６２０はフレーム番号，６２２にフレームの画素デー
タ，６２４に音声データという形式がフレーム数だけ繰
り返される。物理映像呼出部は，分類コードにより，レ
ーザディスク等の映像再生装置１０を用いる映像である
とわかれば，映像再生装置に制御命令を送って該当する
映像を呼び出す処理を行い，物理映像中にある場合に
は，該当する映像を呼び出す。FIG. 14 shows an example of the physical video structure 600. 602 is an ID number that uniquely identifies the physical video. 604 is a classification code that identifies whether the video is on a laser disc, on a video tape, or is data stored in an external information storage device. 606 is the representative frame number, 608 is the start frame number, and 610 is the end frame number. 616 holds attribute information. The remaining fields are needed when the video data is held inside the physical video data structure itself. 612 is the screen width of the video and 614 is its height. 618 is a directory that records the address within the physical video at which the frame image data for a given frame number begins. The frame number 620, the pixel data 622 of the frame, and the audio data 624 are repeated in this format for as many frames as there are. If the classification code shows that the video is one that uses the video playback device 10, such as a laser disc player, the physical video calling unit sends a control command to the video playback device to call the corresponding video; if the video data is held in the physical video structure, it calls the corresponding video from there.
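
The branch on the classification code 604 can be sketched as below. This is only an illustration under stated assumptions: the enum values, the function name `call_frame`, and the action tuples it returns are hypothetical, and treating video tape like laser disc (as a source played on an external device) is an assumption, since the text names only the laser disc case explicitly.

```python
from enum import Enum
from typing import Dict, Optional, Tuple


class SourceKind(Enum):
    """Hypothetical values for classification code 604."""
    LASER_DISC = 1
    VIDEO_TAPE = 2
    STORED_DATA = 3


def call_frame(kind: SourceKind, frame: int,
               directory: Optional[Dict[int, int]] = None) -> Tuple[str, int]:
    """Describe how a frame would be fetched for the given source kind.

    `directory` plays the role of directory 618: frame number -> address
    of that frame's data inside the physical video structure.
    """
    if kind in (SourceKind.LASER_DISC, SourceKind.VIDEO_TAPE):
        # Video on an external player: issue a control command to the device.
        return ("send_player_command", frame)
    if directory is None:
        raise ValueError("stored data requires a frame directory")
    # Video held in the structure: look up where the frame data begins.
    return ("read_at_address", directory[frame])
```

For stored data, `call_frame(SourceKind.STORED_DATA, 2, {1: 0, 2: 4096})` resolves the frame through the directory instead of commanding a player.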

【００４０】論理映像を用いるメリットの一つは，大き
なデータ量になりがちな物理映像の１本から，その映像
を用い，様々に編集された多種多様の映像作品を少ない
データ量で作れることにある。特に，ニュースなど過去
の資料映像を頻繁に使い回すような映像ほど，論理映像
を用いる利点が大きい。もう一つのメリットは，シーン
ごとに登場するオブジェクトをあらかじめ記憶しておく
ことにより，映像再生中にどの事物が現れているのか
を，全てのオブジェクトについて調べる必要がなくな
り，迅速な処理が期待できる。One advantage of using logical video is that, from a single physical video, which tends to involve a large amount of data, a wide variety of differently edited video works can be created with only a small amount of additional data. The advantage is particularly large for video such as news, in which past archive footage is reused frequently. Another advantage is that, because the objects appearing in each scene are stored in advance, there is no need to examine every object to find out which things appear during reproduction, so faster processing can be expected.

【００４１】先に簡単に説明した図１のコンピュータ画
面例を用いて，連想検索のインタフェース部分の実行手
順について詳細に説明する。モニタウインドウ１１００
には，前述の映像再生表示部１００により任意の映像が
表示される。表示と合わせ，音声もスピーカ１２から出
力される。１１０４がカーソルで，マウスやジョイステ
ィク等の間接的なポインティングデバイス５の操作に合
わせて画面上を移動しポイント操作を行う。同様のポイ
ント操作はタッチパネルのような直接的なポイティング
デバイス１３によっても行うことができ，その際はカー
ソルは不要にできる。前述のポイント位置検出部１０４
は，これらのポインティングデバイスを常時監視し，マ
ウスの移動に合わせてカーソル１１０４を移動したり，
マウスのボタンが押されたときには，ポイント操作があ
ったとして，そのときの画面上のカーソルの位置情報
を，その位置情報を必要とする各処理モジュールに送
る。タッチパネルの場合には，タッチがあった時点で，
そのタッチされた位置を検出し，その位置情報を送る。
１１０２は，映像の再生状態を制御するための操作パネ
ルであり，操作パネル部１１０によって，再生・早送り
などの再生状態を示す絵や文字が描かれたボタンと，モ
ードを変更するためのボタン，映像再生表示部からの各
種情報を表示するためのディスプレイ領域などが表示さ
れる。操作パネルの表示領域がポイントされたことが，
ポイント位置検出部１０４から伝えられると，その位置
情報から，さらにどのボタンがポイントされたかを検出
し，そのボタンに対応づけられた制御コードが映像再生
表示部１００に送られる。１１０６は，汎用入出力ウイ
ンドウで，キーボード１１等を使って各種情報をコンピ
ュータとやりとりできる。ファイル名を入力すること
で，連想検索を行う映像の指定をこのウインドウから行
うことができる。入力されたファイル名は，再生開始位
置を示す先頭フレームの番号と一緒に再生位置設定情報
３１０として映像再生表示部１００に送られ，１００の
中の論理映像呼出部３００は，その情報から対応する映
像を呼び出し，物理映像呼出部を経由して映像がモニタ
ウインドウ１１００に表示される。また，映像の各種関
連情報をこの汎用入出力ウインドウ１１０６に表示する
こともできる。The execution procedure of the interface part of the associative search is now described in detail using the example computer screen of FIG. 1, which was briefly described earlier. The video reproduction display unit 100 described above displays an arbitrary video in the monitor window 1100. Along with the display, audio is output from the speaker 12. 1104 is a cursor, which moves on the screen in response to operation of the indirect pointing device 5, such as a mouse or joystick, and is used for pointing. Pointing can equally be done with the direct pointing device 13, such as a touch panel, in which case the cursor can be omitted. The point position detection unit 104 described above constantly monitors these pointing devices. It moves the cursor 1104 as the mouse moves, and when a mouse button is pressed it treats this as a pointing operation and sends the position of the cursor on the screen at that moment to each processing module that needs it. With a touch panel, it detects the touched position at the moment of the touch and sends that position information. 1102 is an operation panel for controlling the reproduction state of the video. The operation panel unit 110 displays on it buttons bearing pictures or text that indicate reproduction states such as play and fast forward, buttons for changing the mode, and a display area for showing various information from the video reproduction display unit. When the point position detection unit 104 reports that the display area of the operation panel has been pointed at, the operation panel unit determines from the position information which button was pointed at and sends the control code associated with that button to the video reproduction display unit 100. 1106 is a general-purpose input/output window, through which various information can be exchanged with the computer using the keyboard 11 or the like. The video to be searched associatively can be specified from this window by entering a file name. The entered file name is sent to the video reproduction display unit 100 as reproduction position setting information 310, together with the number of the first frame, which indicates the reproduction start position. The logical video calling unit 300 within the unit 100 calls the corresponding video on the basis of that information, and the video is displayed in the monitor window 1100 by way of the physical video calling unit. Various related information of the video can also be displayed in the general-purpose input/output window 1106.

【００４２】ウインドウ１１０８に表示中のアイコン１
１１０の一つがポイントされたことがポイント位置検出
部によって検出されると，インデクス管理部１０８は，
そのアイコンに対応するシーンの先頭フレーム番号を再
生位置設定情報として映像再生表示部１００に伝える。
１００は，モニタウインドウ１１００にそのシーンの映
像を表示する。表示された映像は，１１０２の操作パネ
ルによって再生や早送りなどの制御ができる。これによ
り映像の再生が開始されると，論理映像呼出部３００が
出力する再生位置情報３１４が，インデクス管理部１０
８に伝えられ，１０８は，１１０８のウインドウにおい
て，例えば，再生中のシーンのアイコンがハイライトし
たり点滅するといった強調表示を行い，現在モニタウイ
ンドウ１１００で再生されている映像に対応するシーン
が一目でわかるようにする。When the point position detection unit detects that one of the icons 1110 displayed in the window 1108 has been pointed at, the index management unit 108 sends the first frame number of the scene corresponding to that icon to the video reproduction display unit 100 as reproduction position setting information. The unit 100 displays the video of that scene in the monitor window 1100. The displayed video can be controlled, for example played or fast-forwarded, from the operation panel 1102. Once reproduction of the video starts, the reproduction position information 314 output by the logical video calling unit 300 is passed to the index management unit 108, which emphasizes the icon of the scene being reproduced in the window 1108, for example by highlighting it or making it blink, so that the scene corresponding to the video currently playing in the monitor window 1100 can be seen at a glance.

【００４３】１１０８におけるシーンの表示は階層的に
行うことができる。まず，ポイントのしかたを，例え
ば，クリックとダブルクリックとの２種類用意し，クリ
ックを上述の映像呼び出しのためのポイント手段とし
て，ダブルクリックを後述するシーンの階層管理のため
のポイント手段に用いる。１１０８に表示されたアイコ
ンの一つがポイントされたことがポイント位置検出部に
よって検出されると，インデクス管理部１０８は，それ
がダブルクリックかどうかを調べる。ダブルクリックで
なければ，上述の映像呼び出しの処理を行い，ダブルク
リックならば，ポイントされたシーンに対応するシーン
構造体１０００の中の下位シーン１０１４を参照し，１
１０８と同様のウインドウを新たに作成して，それら下
位シーンのアイコンを一覧表示する。こうして新たに作
成されたウインドウは，１１０８と同様にポイントを検
出する対象となり，このウインドウ上のアイコンがポイ
ントされると，インデクス管理部は，対応するシーンを
モニタウインドウに表示したり，さらに下位のシーンが
あれば，それら下位のシーンの一覧表示を行うウインド
ウを新たに作成する。こうした階層的な管理は，映像の
選択の際にも用いることができ，１本の映像ごとに，そ
の全てのシーンを束ねる最上位のシーン１個を対応づけ
ておけば，上記の枠組みの範疇で，登録されている映像
の中から所望の映像をウインドウから選択したり，さら
に下位のシーンの一覧を表示させたりすることができ
る。Scenes can be displayed hierarchically in the window 1108. First, two ways of pointing are provided, for example a click and a double click. The click is used as the pointing means for calling a video as described above, and the double click is used as the pointing means for the hierarchical management of scenes described below. When the point position detection unit detects that one of the icons displayed in the window 1108 has been pointed at, the index management unit 108 checks whether the action was a double click. If it was not, the video calling process described above is performed. If it was, the index management unit refers to the lower scenes 1014 in the scene structure 1000 corresponding to the pointed scene, creates a new window similar to the window 1108, and displays a list of the icons of those lower scenes. The newly created window is subject to point detection in the same way as the window 1108. When an icon in it is pointed at, the index management unit displays the corresponding scene in the monitor window or, if there are scenes at a still lower level, creates a new window that lists them. This hierarchical management can also be used for selecting a video. If each video is associated with a single top-level scene that bundles all of its scenes, then within the same framework a desired video can be selected from the registered videos in a window, and a list of its lower scenes can be displayed.
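
The click and double-click handling can be sketched as a small dispatch function. The names `Scene` and `on_icon_pointed` and the returned action tuples are hypothetical. What happens on a double click of a scene with no lower scenes is not stated in the text; this sketch simply returns an empty list in that case.

```python
from dataclasses import dataclass, field
from typing import List, Tuple, Union


@dataclass
class Scene:
    """Reduced scene record: only what the dispatch needs."""
    name: str
    start_frame: int
    children: List["Scene"] = field(default_factory=list)  # lower scenes 1014


def on_icon_pointed(scene: Scene, double_click: bool) -> Tuple[str, Union[int, List[str]]]:
    """Decide what the index management unit does for a pointed icon."""
    if not double_click:
        # Single click: call the video from the first frame of the scene.
        return ("play", scene.start_frame)
    # Double click: open a new index window listing the lower scenes.
    return ("open_window", [child.name for child in scene.children])


movie = Scene("movie", 0, [Scene("act1", 0), Scene("act2", 900)])
```

Because the new window is handled the same way, the function can be applied again to any scene listed in it, which gives the recursive descent through the hierarchy.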

【００４４】１１１２は，アイコン１１１４と時間軸表
示部１１１６からなり，例えば，別々のシーンに現れて
いるが実は同じ被写体である，などといった基準で分類
された幾つかの事物をまとめ，代表する一つのアイコン
１１１４を表示して，その横に，映像全体の中でそれら
の事物が登場する区間を，横軸を時間軸にとった棒グラ
フで表示したインデクスである。同じ分類の事物は各々
オブジェクト構造体５００で管理されており，共通のオ
ブジェクト構造体へのポインタを上位オブジェクト５１
６に持つ。逆に上位オブジェクトは，各事物のオブジェ
クト構造体へのポインタを下位オブジェクト５１８に連
接リスト形式で持つ。インデクス管理部１０８は，上位
オブジェクトを記憶管理する。アイコンとして表示され
るのは，上位オブジェクトの構造体が記憶する代表フレ
ームの縮小画像である。棒グラフは，下位オブジェクト
の各々を調べ，その始点・終点フレーム番号から映像全
体に占める区間を計算して描画する。この棒グラフにお
ける事物の登場区間に対応する部分がポイントされたこ
とが検出されると，インデクス管理部１０８は，その部
分の映像をモニタウインドウ１１００に表示させる。ア
イコンをポイントしてオブジェクトを選択し関連情報を
付与・変更すれば，上位オブジェクトの関連情報とし
て，すなわち，同じ分類の全ての事物に共通の情報とし
て登録される。The window 1112 consists of icons 1114 and time-axis display parts 1116. It is an index in which several things grouped by some criterion, for example that they appear in different scenes but are in fact the same subject, are brought together and represented by a single icon 1114, and beside the icon the sections of the whole video in which those things appear are shown as a bar graph whose horizontal axis is time. Each thing in the same group is managed by its own object structure 500 and holds, in the upper object 516, a pointer to a common object structure. Conversely, the upper object holds, in the lower object 518, pointers to the object structures of the individual things in linked-list form. The index management unit 108 stores and manages the upper objects. What is displayed as an icon is a reduced image of the representative frame stored in the structure of the upper object. The bar graph is drawn by examining each lower object and calculating, from its start and end frame numbers, the section it occupies within the whole video. When it is detected that a part of the bar graph corresponding to an appearance section of a thing has been pointed at, the index management unit 108 causes the video of that part to be displayed in the monitor window 1100. If an object is selected by pointing at its icon and related information is added or changed, that information is registered as related information of the upper object, that is, as information common to all things in the same group.
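
The bar graph calculation can be sketched as a mapping from frame ranges to pixel ranges. The function names, the integer pixel arithmetic, and the choice to start playback at the first frame of the clicked section are assumptions for illustration; the text says only that the section is computed from the start and end frame numbers and that the corresponding video is shown.

```python
from typing import List, Optional, Tuple


def bar_segments(sections: List[Tuple[int, int]], total_frames: int,
                 bar_width: int) -> List[Tuple[int, int, int]]:
    """Map each (start_frame, end_frame) of a lower object to a pixel span.

    Returns (left, right, start_frame) with `right` exclusive; every
    segment is at least one pixel wide so short sections stay clickable.
    """
    segments = []
    for start, end in sections:
        left = start * bar_width // total_frames
        right = (end + 1) * bar_width // total_frames
        segments.append((left, max(right, left + 1), start))
    return segments


def frame_for_click(segments: List[Tuple[int, int, int]], x: int) -> Optional[int]:
    """Return the start frame of the segment under pixel x, or None."""
    for left, right, start in segments:
        if left <= x < right:
            return start
    return None


segs = bar_segments([(0, 99), (500, 749)], total_frames=1000, bar_width=200)
```

A click at pixel 120 falls inside the second segment, so `frame_for_click(segs, 120)` gives the start frame of that appearance section.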

【００４５】一方，モニタウインドウ１１００がポイン
トされたことが検出されると，そのポイント位置の情報
から，ポイント事物識別部１０２によって，映像中のど
の事物がポイントされたかを検出する。この処理は，現
在再生中のシーンがどれであるかを示す再生位置情報３
１４を論理映像呼出部３００から受け，そのシーンに対
応するシーン構造体の対応オブジェクト１０１０に記憶
されているオブジェクトのそれぞれについて，その始点
・終点を調べて，現在再生中のフレーム番号を示す再生
位置情報３１６と比較し，そのオブジェクトで表される
事物が現在画面上に現れているのかどうかを判定する。
現れていると判定された事物の各々について，事物の位
置，すなわちオブジェクトの位置５１４と再生位置情報
３１６とから，現在の事物の存在領域を求め，その中に
ポイントされた位置が含まれているかどうかを判定す
る。複数合致した場合には，優先順位の高いものを１つ
だけ選択する。優先順位は，例えば，連接リストの登録
順で表現できる。この方法だと，優先順位のために特別
なデータ領域を用意する必要がない。ポイントされたと
判定された事物がある場合には，その事物のオブジェク
ト構造体中のオブジェクト属性情報５２０を調べて，
「連想検索のジャンプ先」を意味するキーを持つディク
ショナリ構造体８００を探し，リンク８１０に登録され
たオブジェクトの始点フレーム番号を読みだして，その
フレームにジャンプする。オブジェクト属性情報５２０
に該当するキーがないときには，共通の上位オブジェク
トを持つ別の事物が登場しているシーンにジャンプする
ようにする。これは，ポイントされた事物の１ランク上
位のオブジェクトに登録されている下位オブジェクトの
連接リストを参照し，その事物に連接する次のオブジェ
クトの始点フレーム番号を読みだして，そのフレームに
ジャンプする。On the other hand, when it is detected that the monitor window 1100 has been pointed at, the point thing identifying unit 102 determines from the point position information which thing in the video was pointed at. In this process, the unit receives from the logical video calling unit 300 the reproduction position information 314, which indicates which scene is currently being reproduced. For each object stored in the corresponding objects 1010 of the scene structure for that scene, it examines the start and end points and compares them with the reproduction position information 316, which indicates the frame number currently being reproduced, to determine whether the thing represented by the object is currently on the screen. For each thing determined to be on the screen, it obtains the region the thing currently occupies from the position of the thing, that is, the object position 514, and the reproduction position information 316, and determines whether the pointed position lies inside that region. If several things match, only the one with the highest priority is selected. Priority can be expressed, for example, by the order of registration in the linked list; with this method no special data area needs to be reserved for priority. If a thing is determined to have been pointed at, the object attribute information 520 in the object structure of that thing is examined to find a dictionary structure 800 whose key means “jump destination of associative search”, the start frame number of the object registered in its link 810 is read, and reproduction jumps to that frame. If the object attribute information 520 contains no such key, reproduction jumps to a scene in which another thing sharing the same upper object appears. To do this, the list of lower objects registered in the object one rank above the pointed thing is consulted, the start frame number of the object that follows the pointed thing in that list is read, and reproduction jumps to that frame.
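
The identification and jump steps can be sketched together. This is a simplified illustration under stated assumptions: all names are hypothetical, each object carries a single rectangle instead of a position list 700, the jump link is stored directly on the object instead of in a dictionary 800, and the case where the pointed thing is the last one in its upper object's list (not covered in the text) returns None.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(eq=False)
class Group:
    """Upper object: holds its lower objects in registration order (518)."""
    children: List["Thing"] = field(default_factory=list)


@dataclass(eq=False)
class Thing:
    name: str
    start_frame: int
    end_frame: int
    rect: Tuple[int, int, int, int]       # x, y, width, height on screen
    jump_link: Optional["Thing"] = None   # link 810 under the jump key, if any
    parent: Optional[Group] = None        # upper object (516)


def identify(scene_objects: List[Thing], frame: int, px: int, py: int) -> Optional[Thing]:
    """Return the first on-screen thing whose rectangle contains the point.

    List order doubles as priority, so no separate priority field is needed.
    """
    for thing in scene_objects:
        if not (thing.start_frame <= frame <= thing.end_frame):
            continue
        x, y, w, h = thing.rect
        if x <= px < x + w and y <= py < y + h:
            return thing
    return None


def jump_target(thing: Thing) -> Optional[int]:
    """Frame to jump to: explicit link first, else the next sibling."""
    if thing.jump_link is not None:
        return thing.jump_link.start_frame
    if thing.parent is not None:
        siblings = thing.parent.children
        i = siblings.index(thing)
        if i + 1 < len(siblings):
            return siblings[i + 1].start_frame
    return None


group = Group()
a = Thing("taro@scene1", 0, 100, (10, 10, 50, 50), parent=group)
b = Thing("taro@scene5", 400, 500, (0, 0, 20, 20), parent=group)
group.children = [a, b]
caption = Thing("caption", 0, 100, (0, 0, 200, 200), jump_link=b)
```

With `[a, caption]` as the scene's object list, a point inside both rectangles resolves to `a` because it is registered first.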

【００４６】以上のように，階層的にシーンを探して当
りをつけてから映像をモニタウインドウで確認し，連想
検索を行い，またインデクスウインドウで確認するとい
ったことが可能になる。これは，シーンによって構成さ
れた論理映像による映像管理手段を導入したことによっ
て達成されている。As described above, it becomes possible to search through the scenes hierarchically to get a rough idea of where to look, check the video in the monitor window, perform an associative search, and then check again in the index window. This is achieved by introducing a video management means based on logical video composed of scenes.

【００４７】FIG. 15 shows a detailed screen example of the monitor window 1100. Reference numeral 1200 denotes the area in which the video is actually displayed, and 1202 displays the number of the frame being reproduced, as sent from the video reproducing unit 200. The frame number display doubles as a numeric input field: when the number is edited with a keyboard or the like, the edited value is treated as a new frame number, and the video can be reproduced from the scene corresponding to that number. Reference numeral 1204 is an indicator panel showing which part of the whole video is currently being reproduced. The reproduction position is indicated by where the indicator rod 1206 sits on this panel. The position of the indicator rod is calculated from the frame number described above and the structure data of the logical video being reproduced. The vertical bars 1208 are lines marking scene boundaries, so the user can also tell intuitively which scene is being reproduced. With this panel, a jump caused by an associative search is clearly recognizable from the large movement of the indicator rod 1206, which removes any confusion over whether the scene merely changed naturally within the video. When the indicator rod 1206 is pointed at and forcibly moved by a drag operation, the operation panel unit 110 uses the final position reported by the point position detection unit 104 after the move to calculate the scene and frame number corresponding to that position, and can pass this information to the video control unit 106 so that reproduction starts from the corresponding part of the video. Reference numeral 1210 is a button for closing the monitor window.
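The mapping between a frame number and the position of the indicator rod 1206 can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the logical video is an ordered list of scenes, each covering an inclusive frame range of the physical video, and the names `Scene`, `rod_position`, and `locate` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Scene:
    start: int  # first frame of the scene (inclusive), assumed field
    end: int    # last frame of the scene (inclusive), assumed field

    @property
    def length(self) -> int:
        return self.end - self.start + 1


def rod_position(scenes, scene_index, frame, panel_width):
    """Map (scene, frame) to an x offset on the indicator panel."""
    total = sum(s.length for s in scenes)
    # Frames of all earlier scenes plus the offset inside the current scene.
    played = sum(s.length for s in scenes[:scene_index])
    played += frame - scenes[scene_index].start
    return played * panel_width / total


def locate(scenes, x, panel_width):
    """Inverse mapping used after a drag: x offset -> (scene index, frame)."""
    total = sum(s.length for s in scenes)
    # Clamp so that x == panel_width still maps to the last frame.
    offset = max(0, min(int(x * total / panel_width), total - 1))
    for i, s in enumerate(scenes):
        if offset < s.length:
            return i, s.start + offset
        offset -= s.length
    raise ValueError("empty scene list")
```

Because the scenes of a logical video need not be contiguous in the physical video, the position is accumulated scene by scene rather than taken directly from the frame number.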

【００４８】FIG. 16 is an example of the video display screen when there are objects mapped to audio. Because audio is information that cannot be seen, it is visualized in the form of buttons 1400 and 1402. Whether an object is audio can be determined by the thing presence/absence determining unit 202 checking the object classification code 504. When unit 202 uses the information on the scene and frame currently being reproduced to check which objects are appearing, it displays a button if the classification code of an appearing object is that of audio. The display position of the button is registered in the object position 514. As a result, pointing at the button jumps to a scene related to that audio, without any change to the processing of the point thing identification unit. One button is displayed for each kind of object mapped to the audio currently being reproduced, and the buttons are distinguished by the titles on their faces.
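The button selection described above might look like the sketch below. The class `VideoObject`, its fields, and the value of the `AUDIO` code are assumptions made for illustration; the text only states that a classification code marks an object as audio and that the button position is stored as the object position.

```python
from dataclasses import dataclass

AUDIO = "audio"  # hypothetical value of the object classification code 504


@dataclass
class VideoObject:
    title: str           # text shown on the button face
    classification: str  # stands in for classification code 504
    first_frame: int     # first frame in which the object appears
    last_frame: int      # last frame in which the object appears
    position: tuple      # (x, y, w, h), stands in for object position 514


def audio_buttons(objects, frame):
    """Return (title, position) for every audio object present at `frame`."""
    return [
        (o.title, o.position)
        for o in objects
        if o.classification == AUDIO and o.first_frame <= frame <= o.last_frame
    ]
```

Since each button occupies an ordinary object position, the same hit test used for visible things can be applied to it unchanged.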

【００４９】FIGS. 17(a) to 17(c) are examples of the display screen when jumping to another scene by associative search. When a thing on the screen is pointed at, the video reproduction display unit 100 makes the change with a special effect added, so that it is easy to distinguish from an ordinary scene change in the video. For example, the scene changes in such a way that a reduced image of the destination scene rapidly grows outward from the centroid of the region of the pointed thing. This also makes it immediately clear which thing was pointed at.
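One way to realize the growing-image effect is to interpolate the rectangle of the destination scene linearly between the centroid of the pointed thing and the full screen. The function below is a sketch under that assumption; `zoom_rect` and its parameters are hypothetical names, and the text does not prescribe a particular interpolation.

```python
def zoom_rect(centroid, screen_w, screen_h, t):
    """Rectangle (x, y, w, h) of the destination scene at progress t in [0, 1].

    At t = 0 the rectangle is a point at the centroid; at t = 1 it covers
    the whole screen.
    """
    cx, cy = centroid
    t = max(0.0, min(1.0, t))
    w, h = screen_w * t, screen_h * t
    # The top-left corner slides from the centroid to the screen origin,
    # so the rectangle always contains the centroid and stays on screen.
    x, y = cx * (1 - t), cy * (1 - t)
    return x, y, w, h
```

Calling the function with increasing `t` on successive frames yields the sequence of rectangles into which the reduced destination image is drawn.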

【００５０】Incidentally, 1212 in FIG. 15 is a button for deciding whether to display the related information of things. Pointing at this button brings up a menu such as 1300 shown in FIG. 18. Besides OFF, which turns the display of related information off, the menu lists the types of related information that can currently be displayed. The user can select from this menu the type of related information to view. The selection is passed through the video control unit 106, as the control signal 210, to the related information display unit 204 of the video reproduction display unit 100, and determines whether related information is displayed and, if so, which key's information is shown. The menu is created for each video: all keys in the dictionaries of the object attribute information 520 in every object structure 500 registered for that video are examined, and every type found is listed in the menu. Reference numeral 1214 denotes a button for changing the mode; it switches among the associative search mode, the mode for changing related information, and so on. This changes the internal state of the point thing identification unit 102, so that the processing performed when a point is reported from the point position detection unit matches the current internal state.
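Building the menu 1300 amounts to taking the union of the dictionary keys of all objects registered for a video. The sketch below assumes each object's attribute information 520 is available as a plain Python `dict`; `build_menu` is a hypothetical name.

```python
def build_menu(attribute_dicts):
    """Collect every distinct key, in first-seen order, after the OFF entry."""
    keys = []
    for attrs in attribute_dicts:
        for key in attrs:
            if key not in keys:  # list each type of related information once
                keys.append(key)
    return ["OFF"] + keys
```

For example, two objects with the keys `name`, `price` and `name`, `maker` yield the menu `OFF`, `name`, `price`, `maker`.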

【００５１】FIG. 19 is an example of a screen displaying related information. The related information is displayed superimposed on the thing so that the relationship between the thing 1500 in the video and its related information 1502 can be seen at a glance. When the thing presence/absence determining unit 202 has determined the things currently appearing by the procedure described above, it reads out the object position 514 for each of those things, obtains the centroid from that position information, obtains the centroid of the area needed to display the related information, and sets the display position of the related information so that the two centroids coincide. However, when several things are close together, the displays are offset from one another so that the related information displays 1502 do not overlap. The related information 1502 is not limited to text as in the figure and may equally be an image such as an icon. During an associative search, the point thing identification unit 102 also recognizes a point on the display area of the related information 1502 as a point on the corresponding thing, so that the user can jump to another scene. This is done by holding two pieces of position information for each thing and judging a hit by their OR. As shown in FIG. 20, the correspondence can also be displayed clearly by joining the related information 1502 and the thing 1500 with a connecting line 1504. In particular, by fixing the display position of the related information 1502 and changing only the connecting line to follow the movement of the thing, an associative search can be performed easily by pointing at the fixed display 1502 even when the thing moves so rapidly that it is hard to point at.
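The centroid alignment and the OR-based hit test can be sketched as below. The sketch assumes that both the thing and its related information occupy axis-aligned rectangles given as `(x, y, w, h)`; the function names are hypothetical, and the mutual offsetting of overlapping labels is not covered.

```python
def centroid(rect):
    """Centre point of a rectangle (x, y, w, h)."""
    x, y, w, h = rect
    return x + w / 2, y + h / 2


def place_label(obj_rect, label_w, label_h):
    """Position the label so that its centroid matches the thing's centroid."""
    cx, cy = centroid(obj_rect)
    return cx - label_w / 2, cy - label_h / 2, label_w, label_h


def contains(rect, px, py):
    x, y, w, h = rect
    return x <= px <= x + w and y <= py <= y + h


def is_pointed(obj_rect, label_rect, px, py):
    """A thing counts as pointed if either of its two regions is hit (OR)."""
    return contains(obj_rect, px, py) or contains(label_rect, px, py)
```

With a fixed label position, only `obj_rect` changes from frame to frame, so the label remains a stable target for pointing.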

【００５２】When the internal state of the system is the related information changing mode, pointing at the displayed related information text 1502 makes a character correction cursor 1506 appear, as shown in FIG. 21, and the text can be changed immediately on the spot using a keyboard or the like. If the displayed information is related information stored in an upper-level object, this change updates the related information at once for all things that share the same upper-level object. To change related information other than what is displayed, a related information change window 1600 such as that shown in FIG. 22 appears. Reference numeral 1602 is a list of the keys of the related information. This list contains the related information of the thing itself as well as the related information of its upper-level objects. Pointing at the button 1604 brings up a character input window; a new key entered there is registered and added to the list 1602. A key shown in the list 1602 can be selected by pointing and is highlighted when selected. If something is entered in the character input area 1608 in this state, it is registered as the related information corresponding to the selected key. Reference numeral 1606 denotes a button for deleting a key: pointing at 1606 with a key selected removes the key together with its related information. Reference numeral 1610 is a button to point at in order to accept the changes made in this way and finish, and 1612 is a button to point at in order to cancel all the changes.
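The effect of editing related information held by an upper-level object can be illustrated with a small sketch. It assumes each object keeps its own attribute dictionary plus a reference to one upper-level object; the class `VideoObject` and its methods are hypothetical.

```python
class VideoObject:
    def __init__(self, attrs=None, parent=None):
        self.attrs = dict(attrs or {})
        self.parent = parent  # upper-level object, or None

    def owner_of(self, key):
        """Find the nearest object in the chain that stores `key`."""
        obj = self
        while obj is not None:
            if key in obj.attrs:
                return obj
            obj = obj.parent
        return None

    def get(self, key):
        owner = self.owner_of(key)
        return owner.attrs[key] if owner is not None else None

    def edit(self, key, value):
        """Write the value where it is stored, so sharers see the change."""
        owner = self.owner_of(key) or self
        owner.attrs[key] = value

    def keys(self):
        """Own keys followed by keys of upper-level objects (cf. list 1602)."""
        seen, obj = [], self
        while obj is not None:
            seen += [k for k in obj.attrs if k not in seen]
            obj = obj.parent
        return seen
```

Editing `name` through one thing changes the value seen by every other thing that shares the same upper-level object, which mirrors the simultaneous update described above.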

【００５３】When the internal state of the system is the thing copy mode, a thing appearing in the video being reproduced can be copied and pasted into another video, both between moving images and between audio. Copying is performed by duplicating the entire object structure of the pointed thing. The copied object shares the upper-level object of the original and is added as a lower-level object of that upper-level object. As for pasting, since a thing in the video is associated with a subspace of the video information, it can be done by replacing a subspace of the same shape in the destination video information. Because the related information is copied and pasted along with the thing, almost no additional work is needed for the related information.
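The copy step can be sketched as follows, assuming an object holds its own attributes, a position record, a reference to its upper-level object, and a list of lower-level objects. The class and the function `copy_object` are hypothetical, and pasting the pixel or audio data itself is outside the sketch.

```python
class VideoObject:
    def __init__(self, attrs=None, position=None, parent=None):
        self.attrs = dict(attrs or {})
        self.position = position
        self.parent = parent   # upper-level object, shared between copies
        self.children = []     # lower-level objects
        if parent is not None:
            parent.children.append(self)


def copy_object(src):
    """Duplicate the structure; the copy shares src's upper-level object."""
    # The constructor registers the copy as a lower-level object of the parent.
    return VideoObject(dict(src.attrs), src.position, src.parent)
```

Since the copy points to the same upper-level object, related information stored there is available to the pasted thing without being entered again.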

【００５４】The embodiment above has been described using an example in which the search is performed on a workstation-level computer, but the invention can also be implemented as a function of a VTR, a TV, or a similar device.

【００５５】[0055]

[Effects of the Invention] According to the present invention, when searching for a desired scene, even if the scene cannot be pinned down from the index information alone, it is enough to find any scene in which some clue related to the desired scene appears; by associatively searching for scenes in which that clue appears, the desired scene is eventually reached. In this way, a multifaceted video search that combines the individual displays becomes possible. In addition, information about a thing in the video being reproduced can be obtained immediately and accurately, without confusion as to which thing the information belongs to. Part or all of the related information of a thing appearing in the video being reproduced can also be changed immediately, on the spot where the thing appears. Furthermore, the monitor window of the present invention makes it possible to keep track at all times of where the scene being reproduced lies within the whole video; even when the scene jumps through an associative search, this, together with special effects such as a wipe, makes the jump explicit so that it is not confused with an ordinary scene change. Pointing at the display area of superimposed related information has the same effect as pointing at the thing itself, so the more convenient pointing method can be chosen for each scene, which improves operability. Listing the related information available for display saves the trouble of typing a key directly, and a forgotten key can be recalled by looking at the menu. As described above, the present invention realizes an easy-to-use associative search.

[Brief description of drawings]

FIG. 1 is a configuration example of a screen of a system that realizes associative search of video.

FIG. 2 is a block diagram of the device configuration of a video associative search system according to an embodiment of the present invention.

FIG. 3 is an explanatory diagram of the video associative search function.

FIG. 4 is a diagram illustrating a subject search method.

FIG. 5 is a processing block diagram for realizing associative search of video.

FIG. 6 is a schematic diagram of an object-oriented data structure.

FIG. 7 is a detailed processing block diagram of the video reproduction display unit.

FIG. 8 is a diagram showing a structure that stores a video object.

FIG. 9 is a diagram showing a structure that stores the position of an object.

FIG. 10 is a diagram showing a structure that stores a dictionary.

FIG. 11 is a detailed processing block diagram of the video reproducing unit.

FIG. 12 is a diagram showing a structure that stores a logical video.

FIG. 13 is a diagram showing a structure that stores a scene.

FIG. 14 is a diagram showing a structure that stores a physical video.

FIG. 15 is a screen example showing the monitor window.

FIG. 16 is a display screen example of the monitor window.

FIG. 17 is a display screen example of the monitor window.

FIG. 18 is an example of a menu display.

FIG. 19 is a display screen example of the monitor window.

FIG. 20 is a display screen example of the monitor window.

FIG. 21 is a display screen example of the monitor window.

FIG. 22 is a diagram showing a window for changing related information.

[Explanation of symbols]

1 ... Display, 2 ... Control signal line, 3 ... Video input device, 4 ... Computer, 5 ... Pointing device, 6 ... External information storage device, 7 ... CPU, 8 ... Connection interface, 9 ... Memory, 10 ... Video reproducing device, 11 ... Keyboard, 12 ... Speaker, 13 ... Touch panel.

Continuation of the front page: (72) Inventor Kazuaki Tanaka, 5030 Totsuka-cho, Totsuka-ku, Yokohama City, Kanagawa Prefecture, c/o Hitachi, Ltd., Software Development Division

Claims

[Claims]

1. A video associative retrieval device comprising: video display means having a video display area for viewing a video and an index display area for displaying index information of the video; point detection means for detecting which of these display areas is pointed at; object management means for registering in advance things in the moving images, or sounds, that appear in the video; and control means for determining the state of the video to be reproduced next based on the point information obtained by the point detection means and a separately stored logical structure description of the video, whereby the displayed video is changed.

2. The video associative retrieval device according to claim 1, further comprising attribute information superimposing means, wherein at least a part of selected attribute information of an object appearing in the video being reproduced is displayed superimposed on the position of the object in the video being reproduced, or is displayed in a form that makes clear that the object and its attribute information correspond.

3. The video associative retrieval device according to claim 1, further comprising attribute information changing means for changing at least a part of the attribute information of an object appearing in the video being reproduced immediately, on the spot where the object appears.

4. The video associative retrieval device according to claim 1, wherein an operation window is provided as a partial area of the display screen of the video display means, the operation window having a video display area, an area displaying the position within the video, an area displaying buttons for controlling the reproduction state of the video, and an area displaying buttons for determining whether attribute information is displayed and the type of information displayed.

5. The video associative retrieval device according to claim 1, wherein, when the scene changes as a result of an object being pointed at, the change is executed with a special video effect added so that it is displayed distinguishably from a normal scene change.

6. The video associative retrieval device according to claim 1, wherein, when the attribute information of an object is displayed, pointing at the display area of that attribute information is also judged to be pointing at the object.

7. The video associative retrieval device according to claim 1, wherein the type of attribute information to be displayed is specified from a displayed list of the types of attribute information in the video that is the target of the associative search.

8. The video associative retrieval device according to claim 6, wherein the display position of the attribute information of the object is fixed, and the correspondence with the object is clearly displayed by a line segment that links the position of the object and the display position of the attribute information and changes continually.

9. A video associative retrieval system comprising: video display means for displaying at least moving images and a video index; audio output means for reproducing audio; input means for instructing a control state of the video; video reproducing means; video input means for converting the video obtained by the video reproducing means into a data format that can be handled by a computer; a memory for storing the data obtained by the video input means; and control means for controlling the display state of the video based on information input through the input means.

10. A video associative retrieval method comprising: registering in advance things in the moving images, or sounds, that appear in a video; determining the state of the video to be reproduced next by detecting that an area on the moving image corresponding to a registered object has been pointed at; and displaying the video on video display means.