JP2001092849A

JP2001092849A - Video retrieval method, computer-readable recording medium having program for causing computer to execute the method recorded thereon, video retrieval processor, method for imparting video index, and computer-readable recording medium having program for causing computer to execute the method recorded thereon, method for creating introductory document of image contents and a computer readable medium having program for causing computer to execute the method recorded thereon

Info

Publication number: JP2001092849A
Application number: JP2000077193A
Authority: JP
Inventors: Takako Hashimoto; 隆子橋本; Yukari Shirata; 由香利白田; Hiroko Mano; 博子真野; Atsushi Iizawa; 篤志飯沢
Original assignee: Ricoh Co Ltd; Jisedai Joho Hoso System Kenkyusho KK
Current assignee: Ricoh Co Ltd; Jisedai Joho Hoso System Kenkyusho KK
Priority date: 1999-07-19
Filing date: 2000-03-17
Publication date: 2001-04-06
Anticipated expiration: 2020-03-17
Also published as: JP3602765B2

Abstract

PROBLEM TO BE SOLVED: To allow queries about the meaning of video content to be made using highly abstract terms or concepts, and to perform retrieval at high speed. SOLUTION: For each term that can be expressed by a combination of a plurality of events, a structural unit suitable as the unit to be retrieved is set as the retrieval granularity, and, based on the occurrence pattern of the events that express the term, a state transition pattern corresponding to the term is defined as an input sequence of a plurality of event indexes that occur consecutively within the retrieval granularity. To retrieve a desired video scene, a term expressing that scene is input; with the retrieval granularity corresponding to the input term as the unit to be retrieved, it is determined, for each structural unit that matches the retrieval granularity, whether the state transition pattern corresponding to the input term matches the occurrence pattern of the event indexes in that structural unit. The matching structural units are output as the retrieval result.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a video search method and a video search processing device that search video structured with structure indexes and event indexes using highly abstract terms; a video index assigning method that assigns video indexes using highly abstract terms; a video content description generating method that generates a description explaining the content of a video scene; and a computer-readable recording medium on which a program for causing a computer to execute these methods is recorded.

[0002]

[Description of the Related Art] In recent years, advances in computer hardware and information processing technology, together with the spread of the Internet and digital satellite broadcasting, have made it possible to use a wide variety of video on a daily basis. As a result, the value, diversity, and entertainment value of video as information have become increasingly important. In addition to continuous playback and viewing, as in conventional television broadcasting and video playback, a variety of uses have been proposed, such as searching for and viewing a desired video scene in video that has been structured by assigning indexes, collecting information, and creating digest versions.

[0003] To search video efficiently, a video, which is a set of temporally continuous video scenes, is usually divided into smaller units (sections). The video is generally not divided physically, for example by a fixed unit of time or by the amount of video, but logically, as sets of video scenes that satisfy predetermined conditions. Adding indexes to the logically divided sections allows the divided video scenes to be handled as reusable semantic groups.

[0004] Possible logical division methods include, for example, dividing the video manually while watching it and assigning indexes to it, and, as one real-time authoring (real-time content description) technique, designating the start point of each logical division as an index (structure index) and treating the span up to the start point of the next logical division as one section.

[0005] In addition to the structure index, which represents a section, an event index, which represents an event that occurred in the video, has also been considered as a video index. Whereas the logical structure index indicates a logical section (duration) of the video, the event index is basically an index that has no section. With current technology, however, only fragmentary information can be set as an event index.

[0006] Conventionally, video search consists of cutting out logical sections by a search that uses these event indexes. In other words, a desired video portion can be found by issuing a query that matches a term (fragmentary information) set as an event index.

[0007]

[Problems to Be Solved by the Invention] According to the conventional technology described above, video can be searched with a query that matches a term set as a structure index or an event index, but it cannot be searched by the meaning of its content.

[0008] In other words, because the conventional technology can assign only fragmentary indexes to video, it is possible to search in units of the logical structure by relying on those fragmentary indexes and to extract a video portion, but the video portion that the user really wants cannot always be found.

[0009] For clarity, "the meaning of the content of a video" is defined in this specification as follows. It is information generated for a specific section when a person watches the video and understands its content abstractly and conceptually; that is, information whose meaning can be discovered only through human subjectivity. Whereas an event index represents an event in the video in fragmentary form, the meaning of the content of a video is the meaning that a person assigns by judging the content of a section as a whole, based on how the events in the video change and on the state of the events that result from those changes. The meaning of the content of a video is therefore established only once the content of the section has become clear.

[0010] Moreover, when video is searched, the section expected as the answer to a user's query does not necessarily correspond to a single structural unit; it may be part of a structural unit or a combination of several structural units. Because the conventional technology selects a single structural unit as the answer (search result) based on the fragmentary indexes assigned to the logical structural units, it cannot always find a video portion that satisfies the user.

[0011] In addition, because video scenes can be interpreted in many ways, when the person who created (that is, defined) the logical structure differs from the person issuing the query, the sections that each regards as meaningful groups do not necessarily coincide.

[0012] Furthermore, searching by the meaning of video content also means searching on the condition of a highly abstract term. With the conventional technology, however, the only available processing is to extract logical structural units by relying on fragmentary index information; except in specific applications, it has generally not been possible to analyze the meaning of video content efficiently under a highly abstract search condition.

[0013] Also, when a character string describing the video content is to be generated from indexes, the conventional technology can only list the fragmentary indexes side by side, and therefore cannot generate text that is easy for the user to understand. That is, it cannot convert fragmentary indexes into a generally meaningful character string and produce sentences that the user can readily follow.

[0014] In real-time authoring, the attribute (or event index) for an event is sometimes determined by the events that follow it. In baseball video, for example, whether a hit is a double or a triple cannot be determined at the moment the ball is hit. The meaning (an attribute or event index) can be assigned only after seeing which base the batter reached.

[0015] As a technique addressing this, Akasako, Iijima, Kakutani, and Tanaka have reported related research on a method for describing content in real time ("Real-time content description method for video data and its implementation," DEWS '99 Proceedings, 3A-2, March 4-6, 1999). According to this report, an incomplete sequence of index information arriving in real time, such as radio commentary, is interpreted and replaced with higher-level concepts. Specifically, the higher-level concepts are expressed in advance as state transition diagrams, and the incoming indexes are replaced with higher-level concepts in real time, thereby correcting the redundancy and incompleteness of the original input index sequence.

[0016] However, as the phrase "real-time content description" indicates, the method proposed by Akasako et al. presupposes that the state transition diagrams are used to interpret indexes that arrive in time series. If this method is applied to search as it is, the indexes are interpreted in time series, so processing takes a long time. Nor does the report propose a solution to the problem of how to perform highly abstract searches at high speed.

[0017] Furthermore, because the indexes arriving in time series must always be compared with all of the state transition diagrams, the method is not necessarily efficient.

[0018] The present invention has been made in view of the above, and an object thereof is to allow queries about the meaning of video content to be made using highly abstract terms or concepts, and to allow the search to be performed at high speed.

[0019] Another object of the present invention is to interpret the meaning of video content, convert it into a generally meaningful character string, and thereby generate a description of the video content that is easy for the user to understand.

[0020] A further object of the present invention is to assign indexes that use highly abstract terms efficiently and at high speed.

[0021]

[Means for Solving the Problems] To achieve the above objects, the video search method according to claim 1 searches a video for a desired video scene using structure indexes and event indexes. The target video has been assigned at least structure indexes, which divide the video into semantic groups, and event indexes, which identify the content and location of events occurring in the video; the video scene of each section delimited by the structure indexes is treated as a structural unit of the video, and the video is structured using a plurality of hierarchical structural units. In advance, for each term that can be expressed by a combination of a plurality of events, a structural unit suitable as the unit to be searched is set as the search granularity, and, based on the occurrence pattern of the events that express the term, a state transition pattern corresponding to the term is defined as an input sequence of a plurality of event indexes that occur consecutively within the search granularity. When a desired video scene is searched for, a term expressing that scene is input; with the search granularity corresponding to the input term as the unit to be searched, it is determined, for each structural unit that matches the search granularity, whether the state transition pattern corresponding to the input term matches the occurrence pattern of the event indexes in that structural unit; and the matching structural units are output as the search result.
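The matching step described for claim 1 can be sketched in Python as follows. This is a minimal sketch rather than the patent's implementation: the class names `StructuralUnit` and `TermDefinition`, the baseball labels, and the decision to model a state transition pattern as a plain list that must occur as a consecutive run of event indexes are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class StructuralUnit:
    """One section of the structured video (hypothetical representation)."""
    name: str
    granularity: str                             # level in the hierarchy, e.g. "at_bat"
    events: list = field(default_factory=list)   # event indexes, in order of occurrence


@dataclass
class TermDefinition:
    """A highly abstract term and how it is recognised (hypothetical)."""
    term: str
    granularity: str                             # search granularity for this term
    pattern: list = field(default_factory=list)  # event indexes that must occur consecutively


def pattern_occurs(pattern, events):
    """Return True if `pattern` appears as a consecutive run inside `events`."""
    n = len(pattern)
    if n == 0:
        return False
    return any(events[i:i + n] == pattern for i in range(len(events) - n + 1))


def search(term_def, units):
    """Return the units at the term's search granularity whose events match its pattern."""
    return [
        u for u in units
        if u.granularity == term_def.granularity
        and pattern_occurs(term_def.pattern, u.events)
    ]


# Hypothetical data: "double play" defined as hit -> out -> out within one at-bat.
double_play = TermDefinition("double play", "at_bat", ["hit", "out", "out"])
units = [
    StructuralUnit("at_bat_1", "at_bat", ["pitch", "hit", "out", "out"]),
    StructuralUnit("at_bat_2", "at_bat", ["pitch", "hit", "out"]),
    StructuralUnit("inning_1", "inning", ["hit", "out", "out"]),
]
results = search(double_play, units)
print([u.name for u in results])
```

Only units at the term's own granularity are examined, so `inning_1` is skipped even though its event sequence would otherwise match.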

[0022] In the video search method according to claim 2, in the video search method of claim 1, for each state transition pattern corresponding to a term, at least one of the event indexes in that pattern is further designated as a key index that serves as the starting point of the search. When a desired video scene is searched for, structural units that match the search granularity and contain an event index matching the key index are found first, and only those structural units are then checked to determine whether the state transition pattern matches the occurrence pattern of the event indexes in the unit.
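The key-index narrowing of claim 2 might look like the sketch below, in which a cheap membership test runs before the full pattern check. The dictionary layout, the field names, and the sample labels are hypothetical, and the consecutive-run matcher is an assumed stand-in for whatever pattern comparison is actually used.

```python
def pattern_occurs(pattern, events):
    """Return True if `pattern` appears as a consecutive run inside `events`."""
    n = len(pattern)
    return n > 0 and any(events[i:i + n] == pattern for i in range(len(events) - n + 1))


def search_with_key_index(term_def, units):
    """Filter by granularity and key index first, then run the full pattern check."""
    candidates = [
        u for u in units
        if u["granularity"] == term_def["granularity"]
        and term_def["key_index"] in u["events"]
    ]
    hits = [u for u in candidates if pattern_occurs(term_def["pattern"], u["events"])]
    return candidates, hits


# Hypothetical term definition and units.
term_def = {
    "term": "double play",
    "granularity": "at_bat",
    "pattern": ["hit", "out", "out"],
    "key_index": "out",
}
units = [
    {"name": "at_bat_1", "granularity": "at_bat", "events": ["pitch", "hit", "out", "out"]},
    {"name": "at_bat_2", "granularity": "at_bat", "events": ["pitch", "ball", "ball"]},
    {"name": "at_bat_3", "granularity": "at_bat", "events": ["pitch", "out"]},
]
candidates, hits = search_with_key_index(term_def, units)
print([u["name"] for u in candidates], [u["name"] for u in hits])
```

In this example the full pattern check runs on two units instead of three, because `at_bat_2` contains no event index equal to the key index.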

[0023] In the video search method according to claim 3, in the video search method of claim 1 or 2, when a desired video scene is extracted from the video based on the structural units output as the search result, the video scene identified by each such structural unit is output.

[0024] In the video search method according to claim 4, in the video search method of claim 1 or 2, when a desired video scene is extracted from the video based on the structural units output as the search result, any higher-level or lower-level structural unit in the structured video can be designated.

[0025] In the video search method according to claim 5, in the video search method of claim 2, when a desired video scene is extracted from the video based on the structural units output as the search result, the video scene is extracted by specifying offsets for clipping before and after the position at which the key index is assigned.
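Clipping by offsets around the key index, as in claim 5, reduces to simple interval arithmetic. The sketch below assumes positions are expressed in seconds and that the clip is clamped to the bounds of the video; neither detail is stated in the text, and the function name `clip_range` is hypothetical.

```python
def clip_range(key_time, before, after, video_start, video_end):
    """Return (start, end) of the clip around the position of the key index.

    `before` and `after` are the offsets, in seconds, taken before and after
    the key index. Clamping to the video bounds is an assumption of this sketch.
    """
    start = max(video_start, key_time - before)
    end = min(video_end, key_time + after)
    return start, end


# Key index at 120 s, clip from 10 s before to 5 s after it.
print(clip_range(120.0, 10.0, 5.0, 0.0, 3600.0))
# Key index near the beginning of the video: the start is clamped.
print(clip_range(3.0, 10.0, 5.0, 0.0, 3600.0))
```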

[0026] In the video search method according to claim 6, in the video search method of any one of claims 2 to 5, when a structural unit is determined to match, that is, when the state transition pattern corresponding to the input term matches the occurrence pattern of the event indexes in the structural unit, an index representing the term is defined as an abstract index, newly added to the matching structural unit, and reused as a key index.
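One possible reading of claim 6 is sketched below: a successful match stores the term on the unit as an abstract index, and later searches for the same term find the unit through that index without repeating the pattern check. Storing abstract indexes in a per-unit set, and all names used here, are assumptions of this sketch.

```python
def pattern_occurs(pattern, events):
    """Return True if `pattern` appears as a consecutive run inside `events`."""
    n = len(pattern)
    return n > 0 and any(events[i:i + n] == pattern for i in range(len(events) - n + 1))


def search_and_annotate(term, granularity, pattern, units):
    """Search for `term`; annotate each matching unit with an abstract index."""
    hits = []
    for unit in units:
        if unit["granularity"] != granularity:
            continue
        if term in unit["abstract_indexes"]:
            # Already annotated by an earlier search: reuse the abstract index.
            hits.append(unit)
        elif pattern_occurs(pattern, unit["events"]):
            unit["abstract_indexes"].add(term)
            hits.append(unit)
    return hits


# Hypothetical units.
units = [
    {"name": "at_bat_1", "granularity": "at_bat",
     "events": ["pitch", "hit", "out", "out"], "abstract_indexes": set()},
    {"name": "at_bat_2", "granularity": "at_bat",
     "events": ["pitch", "hit"], "abstract_indexes": set()},
]
first = search_and_annotate("double play", "at_bat", ["hit", "out", "out"], units)
print([u["name"] for u in first], units[0]["abstract_indexes"])
```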

[0027] In the video search method according to claim 7, in the video search method of any one of claims 1 to 6, an abstract index is set in advance for each term that can be expressed by a combination of a plurality of events, as an index representing that term. When a state transition pattern is defined, the abstract indexes are used in addition to the event indexes, so that the state transition pattern corresponding to a term is defined as an input sequence consisting of event indexes and abstract indexes.

[0028] In the video search method according to claim 8, in the video search method of any one of claims 2 to 7, a plurality of items of attribute information are attached to the structure indexes and the event indexes, and character string definition information is attached to the state transition pattern corresponding to each term; this information is used to generate a description related to the term from the attribute information of the structure indexes and event indexes. Based on the character string definition information of the state transition pattern, a description related to the term is generated by referring to the attribute information in the structural units output as the search result.
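The description generation of claim 8 can be illustrated with a simple template. In this sketch the character string definition information is assumed to be a Python format string with named placeholders, and the attribute names and values are invented for the example; the actual form of the definition information is not specified in this section.

```python
def generate_description(template, structure_attrs, event_attrs):
    """Fill the term's string template from structure-index and event-index attributes."""
    attributes = {**structure_attrs, **event_attrs}
    return template.format(**attributes)


# Hypothetical string definition attached to the pattern for "home run".
template = "{batter} hit a home run off {pitcher} in the {inning} inning."
structure_attrs = {"inning": "7th"}                          # from the structure index
event_attrs = {"batter": "Batter A", "pitcher": "Pitcher B"}  # from the event indexes

description = generate_description(template, structure_attrs, event_attrs)
print(description)
```

A missing attribute raises `KeyError` here, which makes gaps in the attribute information visible instead of silently producing an incomplete sentence.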

[0029] The video search method according to claim 9 searches a video for a desired video scene using structure indexes and event indexes, where the video has been assigned at least structure indexes, which divide the video into semantic groups, and event indexes, which identify the content and location of events occurring in the video, the video scene of each section delimited by the structure indexes is treated as a structural unit of the video, and the video is structured using a plurality of hierarchical structural units. As processing performed before the video search, the method includes a state transition table generating step. In this step, for each term that can be expressed by a combination of a plurality of events, a structural unit suitable as the unit to be searched is set as the search granularity; based on the occurrence pattern of the events that express the term, a state transition pattern corresponding to the term is defined as an input sequence of a plurality of event indexes that occur consecutively within the search granularity; for each state transition pattern, at least one of the event indexes in the pattern is designated as a key index that serves as the starting point of the search; and a state transition table consisting of the terms, search granularities, state transition patterns, and key indexes is generated. As processing performed during the video search, the method includes: a term input step of inputting a term that expresses the desired video scene; a search step of referring to the state transition table and, with the search granularity corresponding to the input term as the unit to be searched, using the key index corresponding to the term to find structural units that contain an event index matching the key index; a first determination step of referring to the state transition table and determining whether each structural unit found in the search step contains all of the event indexes included in the state transition pattern corresponding to the term; a second determination step of determining, for each structural unit found in the first determination step to contain all of them, whether the state transition pattern corresponding to the input term matches the occurrence pattern of the event indexes in the structural unit; and a search result output step of cutting out video scenes from the video based on the structural units determined to match in the second determination step and outputting them as the search result.
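The steps of claim 9 can be sketched as a small pipeline over a state transition table. Everything concrete here is assumed for illustration: the table layout, the field names, the sample term, the use of `collections.Counter` to test presence with multiplicity in the first determination, and the consecutive-run test in the second.

```python
from collections import Counter

# Hypothetical state transition table: term -> search granularity, pattern, key index.
STATE_TRANSITION_TABLE = {
    "double play": {
        "granularity": "at_bat",
        "pattern": ["hit", "out", "out"],
        "key_index": "out",
    },
}


def contains_all(pattern, events):
    """First determination: every event index of the pattern is present in the unit."""
    need, have = Counter(pattern), Counter(events)
    return all(have[e] >= count for e, count in need.items())


def pattern_occurs(pattern, events):
    """Second determination: the pattern appears as a consecutive run of events."""
    n = len(pattern)
    return n > 0 and any(events[i:i + n] == pattern for i in range(len(events) - n + 1))


def retrieve(term, units, table=STATE_TRANSITION_TABLE):
    entry = table[term]
    # Search step: units at the search granularity that contain the key index.
    found = [
        u for u in units
        if u["granularity"] == entry["granularity"] and entry["key_index"] in u["events"]
    ]
    # First determination: cheap presence check.
    complete = [u for u in found if contains_all(entry["pattern"], u["events"])]
    # Second determination: order-sensitive pattern check.
    matched = [u for u in complete if pattern_occurs(entry["pattern"], u["events"])]
    # Output step: report the section of each matching unit as (name, start, end).
    return [(u["name"], u["start"], u["end"]) for u in matched]


units = [
    {"name": "at_bat_1", "granularity": "at_bat", "start": 0.0, "end": 30.0,
     "events": ["pitch", "hit", "out", "out"]},
    {"name": "at_bat_2", "granularity": "at_bat", "start": 30.0, "end": 55.0,
     "events": ["out", "hit", "out"]},      # all events present, wrong order
    {"name": "at_bat_3", "granularity": "at_bat", "start": 55.0, "end": 70.0,
     "events": ["pitch", "out"]},           # fails the first determination
]
result = retrieve("double play", units)
print(result)
```

Running the order-insensitive presence check before the order-sensitive check means the more expensive comparison is applied only to units that could possibly match.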

[0030] Further, the computer-readable recording medium according to claim 10 records a program for causing a computer to execute the video search method according to any one of claims 1 to 9.

[0031] The video search processing device according to claim 11 searches a video for a desired video scene using structure indexes and event indexes, where the video has been assigned at least structure indexes, which divide the video into semantic groups, and event indexes, which identify the content and location of events occurring in the video, the video scene of each section delimited by the structure indexes is treated as a structural unit of the video, and the video is structured using a plurality of hierarchical structural units. The device comprises: video input means for inputting the structured video to be searched; storage means that stores, as a state transition table, terms for specifying the desired video scene, a search granularity that specifies the structural unit to be searched when the video is searched with each term, a state transition pattern that defines each term as an input sequence of a plurality of event indexes occurring consecutively within the search granularity, and a key index that designates at least one of the event indexes in the state transition pattern; operation input means for inputting or specifying a query term for searching for the desired video scene; search means that, when a query term is input or specified via the operation input means, refers to the state transition table in the storage means and uses the key index corresponding to the query term to find, in the video input by the video input means, structural units that match the search granularity and contain an event index matching the key index; determination means that receives the structural units found by the search means and determines whether the state transition pattern corresponding to the query term matches the occurrence pattern of the event indexes in each structural unit; and search result output means that cuts out video scenes from the video based on the structural units determined to match by the determination means and outputs them as the search result.

[0032] In the video search processing device according to claim 12, in the video search processing device of claim 11, the determination means comprises: first determination means that determines whether each structural unit found by the search means contains all of the event indexes included in the state transition pattern corresponding to the query term; and second determination means that determines, for each structural unit found by the first determination means to contain all of them, whether the state transition pattern corresponding to the query term matches the occurrence pattern of the event indexes in the structural unit.

[0033] Further, the video index assigning method according to claim 13 assigns, when a video is structured, video indexes that include at least structure indexes, which divide the video into semantic groups, and event indexes, which identify the content and location of events occurring in the video. In advance, a state transition table is set that associates, for each term: a term whose meaning is established when a plurality of events occur consecutively; a state transition pattern that defines the term using an input sequence of a plurality of event indexes expressing the term; and a search granularity that designates, in correspondence with the term, a structural unit of the video defined by a predetermined structure index. When video indexes are assigned to the video, the state transition table is referred to and, for each structural unit identified by the structure indexes, a search is made for a term whose search granularity matches the structural unit in question and whose input sequence of event indexes in the state transition pattern matches the order in which the event indexes were assigned within that structural unit. If such a term exists, it is determined that the meaning of the matching term has occurred, and an event index or attribute information indicating that the term has been established is assigned.

[0034] The video index assigning method according to claim 14 assigns, when a video is structured, video indexes that include at least structure indexes, which divide the video into semantic groups, and event indexes, which identify the content and location of events occurring in the video. In advance, a state transition table is set that associates, for each term: a term whose meaning is established when a plurality of events occur consecutively; a state transition pattern that defines the term using an input sequence of a plurality of event indexes expressing the term; a search granularity that designates, in correspondence with the term, a structural unit of the video defined by a predetermined structure index; and a center index that designates at least one of the event indexes in the state transition pattern. When video indexes are assigned to the video and an event index matching a center index is assigned, the state transition table is referred to and, for each structural unit identified by the structure indexes, a search is made for a term whose search granularity matches the structural unit in question and whose input sequence of event indexes in the state transition pattern matches the order in which the event indexes were assigned within that structural unit. If such a term exists, it is determined that the meaning of the matching term has occurred, and an event index or attribute information indicating that the term has been established is assigned.

[0035] Further, the computer-readable recording medium according to claim 15 records a program for causing a computer to execute the video index assigning method according to claim 13 or 14.

[0036] Further, the video content description generating method according to claim 16 generates a description explaining the video content for a video that has been assigned at least structure indexes for dividing the video into semantically coherent units and event indexes for specifying the content and location of events that occur in the video, that uses the video scene of each section delimited by a structure index as a structural unit of the video, and that is structured using a plurality of hierarchical structural units. The description is generated using a state transition table in which the information used for generating the description is set, together with character strings, or attribute information convertible to character strings, assigned in advance to the structure indexes and event indexes.

In the state transition table, the following are set for each term that can be expressed by a combination of a plurality of events: a generation granularity, which sets a structural unit suitable as the video unit for which the description is generated; a state transition pattern, which defines the term as an input sequence of a plurality of event indexes occurring in succession within the generation granularity, based on the occurrence pattern of the events expressing the term; a key index, set for each state transition pattern by selecting at least one of the event indexes present in that pattern; and character string definition information, set for each state transition pattern, which specifies the syntax elements used for generating the description and the input source of the character string used for each syntax element.

When a description of the video content is generated, the state transition table is referenced to search for a generation granularity that matches the structural unit for which the description is to be generated, and it is determined whether an event index matching the key index of the state transition pattern corresponding to that generation granularity exists in the target structural unit. If such an event index exists, it is determined whether the corresponding state transition pattern matches the occurrence pattern of event indexes in the target structural unit. If the patterns match, it is determined that the term defined by the corresponding state transition pattern is established, and a description of the video scene of the target structural unit is generated using the character string definition information of the state transition pattern of the established term.

[0037] Further, in the video content description generating method according to claim 17, in the method according to claim 16, when a state transition pattern defining a specific term is established, an index representing that term can be assigned to the structural unit as an abstract index, and in the state transition table an abstract index representing the corresponding term is set as one of the key indexes for each state transition pattern. When determining whether an event index matching the key index of the state transition pattern corresponding to the relevant generation granularity exists in the target structural unit, the abstract index is used preferentially as the key index to determine whether the corresponding abstract index has been assigned to the target structural unit. If the abstract index is set, it is determined that the term defined by the corresponding state transition pattern is established, and a description of the video scene of the target structural unit is generated using the character string definition information of the state transition pattern of the established term.

[0038] Further, in the video content description generating method according to claim 18, in the method according to claim 16 or 17, the character string definition information specifies attribute information of a structure index or an event index as the input source of the character string used as a syntax element.

[0039] Further, in the video content description generating method according to claim 19, in the method according to any one of claims 16 to 18, the character string definition information contains syntax elements based on the 5W1H form: when (When), where (Where), why (Why), who (Who), what (What), and how (How) something turned out.

[0040] Further, in the video content description generating method according to claim 20, in the method according to claim 19, when a plurality of state transition patterns correspond to the relevant generation granularity, it is determined for each state transition pattern whether its term is established. When a plurality of terms are established, the description of the video scene of the target structural unit is generated by combining the syntax elements of the character string definition information set for the state transition pattern of each term.

[0041] Further, in the video content description generating method according to claim 21, in the method according to claim 20, when the description of the video scene of the target structural unit is generated by combining the syntax elements of the character string definition information set for the state transition pattern of each term, and the 5W1H syntax elements contain duplicates, priority is given to the syntax element that refers to the event index occurring later in time.

[0042] Further, in the video content description generating method according to claim 22, in the method according to claim 20 or 21, when the description of the video scene of the target structural unit is generated by combining the syntax elements of the character string definition information set for the state transition pattern of each term, and the 5W1H syntax elements contain duplicates, the state transition patterns of the terms are compared and priority is given to the syntax element of the state transition pattern defined using more event indexes.
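
The two precedence rules for duplicated 5W1H syntax elements (claims 21 and 22) can be illustrated with a minimal Python sketch. The names `Element`, `merge_elements`, `event_time`, and `pattern_size`, and the sample strings, are hypothetical and are not taken from the specification; the sketch only assumes that each syntax element knows the time of the event index it refers to and the size of the state transition pattern it belongs to.

```python
from operator import attrgetter
from typing import Dict, List, NamedTuple

SLOTS = ("when", "where", "why", "who", "what", "how")  # the 5W1H syntax elements


class Element(NamedTuple):
    text: str          # string taken from an index or its attribute information
    event_time: float  # time of the event index the element refers to (assumed field)
    pattern_size: int  # number of event indexes in the owning pattern (assumed field)


def merge_elements(terms: List[Dict[str, Element]], rule: str = "later_event") -> Dict[str, str]:
    """Combine the 5W1H elements of several established terms, resolving duplicates."""
    if rule == "later_event":       # precedence of claim 21: later event wins
        rank = attrgetter("event_time")
    elif rule == "larger_pattern":  # precedence of claim 22: larger pattern wins
        rank = attrgetter("pattern_size")
    else:
        raise ValueError(f"unknown rule: {rule}")
    merged = {}
    for slot in SLOTS:
        candidates = [t[slot] for t in terms if slot in t]
        if candidates:
            merged[slot] = max(candidates, key=rank).text
    return merged


# Two hypothetical terms established for the same structural unit.
term_a = {"who": Element("batter A", 10.0, 3), "how": Element("hit", 10.0, 3)}
term_b = {"how": Element("scored a run", 12.0, 2), "when": Element("1st inning", 0.0, 2)}

by_time = merge_elements([term_a, term_b], "later_event")
by_size = merge_elements([term_a, term_b], "larger_pattern")
```

Only the "how" element is duplicated here, so the two rules differ only in which "how" string survives; the remaining elements are simply combined as in claim 20.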

[0043] Further, in the video content description generating method according to claim 23, in the method according to any one of claims 19 to 22, when the description of the video scene of the target structural unit is generated by combining the syntax elements of the character string definition information set for the state transition pattern of each term, and the 5W1H syntax elements contain duplicates, the duplicated syntax elements are listed side by side as necessary.

[0044] Further, the computer-readable recording medium according to claim 24 records a program for causing a computer to execute the video content description generating method according to any one of claims 16 to 23.

[0045]

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

Embodiments of the video search method of the present invention, the computer-readable recording medium storing a program for causing a computer to execute that method, the video search processing device, the video index assigning method, the computer-readable recording medium storing a program for causing a computer to execute that method, the video content description generating method, and the computer-readable recording medium storing a program for causing a computer to execute that method are described in detail below with reference to the accompanying drawings.

[0046] (Embodiment 1) The video search method, video search processing device, and video index assigning method according to Embodiment 1 are described in the following order: (1) device configuration of the video search processing device; (2) relationship between structure indexes and event indexes; (3) structure of the state transition table; (4) definition examples of state transition patterns (state transition diagrams); (5) example of a specific video search processing algorithm; (6) specific operation example.

[0047] (1) Device configuration of the video search processing device

FIG. 1 is a configuration diagram of the video search processing device of Embodiment 1. FIG. 1(a) shows an example of the hardware configuration of the video search processing device 100, and FIG. 1(b) shows its functional block diagram. The video search processing device 100 searches a video for a desired video scene using structure indexes and event indexes. The target video has been assigned at least structure indexes for dividing the video into semantically coherent units and event indexes for specifying the content and location of events that occur in the video, uses the video scene of each section delimited by a structure index as a structural unit of the video, and is structured using a plurality of hierarchical structural units.

[0048] The hardware of the video search processing device 100 may be any device having at least a CPU (central processing unit), a display, a keyboard, and a magnetic disk. For example, a personal computer such as the one shown in FIG. 1(a) can be used.

[0049] As shown in FIG. 1(b), the video search processing device 100 comprises: a video input unit 101 for inputting the structured video to be searched; a state transition table storage unit 102 that stores the terms, search granularities, state transition patterns, and key indexes described later as a state transition table; an operation input unit 103 for inputting or specifying a query term for retrieving a desired video scene; a video search unit 104 that, when a query term is input or specified via the operation input unit 103, refers to the state transition table in the state transition table storage unit 102 and, using the key index corresponding to the query term, searches the video input via the video input unit 101 for structural units that match the search granularity and contain an event index matching the key index; a determination unit 105 that receives the structural units found by the video search unit 104 and determines whether the state transition pattern corresponding to the query term matches the occurrence pattern of event indexes in each structural unit; and a search result output unit 106 that, based on the structural units determined to match by the determination unit 105, cuts the video scenes out of the video and outputs them as search results.

[0050] The determination unit 105 has a first determination unit 105a and a second determination unit 105b. The first determination unit 105a determines whether all of the event indexes contained in the state transition pattern corresponding to the query term are present in a structural unit found by the video search unit 104. For structural units in which the first determination unit 105a found all of them present, the second determination unit 105b determines whether the state transition pattern corresponding to the query term matches the occurrence pattern of event indexes in the structural unit.

[0051] (2) Relationship between structure indexes and event indexes

Next, the structured video that the video search processing device (video search method) of the present invention searches is described. A structured video is assigned structure indexes for dividing the video into semantically coherent units and event indexes for specifying the content and location of events that occur in the video.

[0052] FIGS. 2 and 3 are explanatory diagrams showing the relationship between structure indexes and event indexes. In FIG. 2, if, for example, the video data (video scene) of the section indicated by the structure index of structure 1 represents one entire video, this video has a plurality of structure indexes below structure 1, such as structure 2-a, structure 2-b, structure 2-c, and so on. The sections indicated by the structure indexes 2-a, 2-b, 2-c, and so on are sections obtained by dividing the entire video indicated by structure 1, and joining all of these sections yields the section of the entire video one level above.

[0053] Below structure 2-a there are in turn a plurality of structure indexes such as structure 3-aa, structure 3-ab, structure 3-ac, and so on. Likewise, the sections indicated by the structure indexes 3-aa, 3-ab, 3-ac, and so on are sections obtained by dividing the section indicated by structure 2-a, and joining all of these sections yields the section of structure 2-a one level above.

[0054] The section indicated by the structure index of the top-level structure 1 is the structural unit representing the entire video. The sections indicated by the next structure indexes, structure 2-a, 2-b, 2-c, and so on, are structural units at the structure-2 level, and the sections indicated by the still lower structure indexes, structure 3-aa, 3-ab, 3-ac, and so on, are structural units at the structure-3 level.

[0055] In this way, the video (structure 1) is structured using a plurality of hierarchical structural units, with the video scene of each section delimited by a structure index serving as a structural unit of the video.

[0056] Whereas a structure index represents the structure of the video, an event index represents an event that occurred in the video. As described above, a structure index indicates a logical section of the video, while an event index is basically an index that has no section. An event index is basically assigned at the place in the video where the event occurred, as information indicating the content of the event. For example, the event indexes 3aa-1 to 3aa-4 may be assigned at the times the events occur along the flow of the video (change along the time axis), as in FIG. 3, or they may be assigned to the structure index 3aa in which the events occurred, as in FIG. 2. Although a detailed description is omitted, the structured index information consisting of structure indexes and event indexes can of course be stored and managed separately from the actual video scenes.
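
As an illustration of the hierarchy described above, the following Python sketch models structural units as a tree and event indexes as point-like markers with no duration. The class names, field names, and time values are assumptions made for this example, not part of the specification.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class EventIndex:
    label: str   # content of the event, e.g. "3aa-1"
    time: float  # position in the video; an event index has no section of its own


@dataclass
class StructureUnit:
    name: str
    level: int
    start: float
    end: float
    children: List["StructureUnit"] = field(default_factory=list)
    events: List[EventIndex] = field(default_factory=list)

    def children_tile_parent(self) -> bool:
        """True if the child sections, joined in order, exactly cover this section."""
        if not self.children:
            return True
        pos = self.start
        for child in sorted(self.children, key=lambda c: c.start):
            if child.start != pos:
                return False
            pos = child.end
        return pos == self.end

    def all_events(self) -> List[EventIndex]:
        """Event indexes of this unit and all lower units, in time order."""
        found = list(self.events)
        for child in self.children:
            found.extend(child.all_events())
        return sorted(found, key=lambda e: e.time)


# Hypothetical video: structure 1 > structures 2-a, 2-b > structures 3-aa, 3-ab.
s3aa = StructureUnit("3-aa", 3, 0.0, 10.0,
                     events=[EventIndex("3aa-1", 1.0), EventIndex("3aa-2", 4.0)])
s3ab = StructureUnit("3-ab", 3, 10.0, 20.0)
s2a = StructureUnit("2-a", 2, 0.0, 20.0, children=[s3aa, s3ab])
s2b = StructureUnit("2-b", 2, 20.0, 50.0)
video = StructureUnit("1", 1, 0.0, 50.0, children=[s2a, s2b])
```

Here `children_tile_parent` checks the property stated for FIG. 2, namely that joining all sub-sections reproduces the section one level above.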

[0057] (3) Structure of the state transition table

Next, the structure of the state transition table, an important element of the present invention, is described in detail. In the present invention, so that a desired video scene can be retrieved using a highly abstract meaning (term), a state transition pattern expressing that term is defined in advance using the structural information of the video (structure indexes and event indexes). This processing corresponds to the state transition table generation step of the present invention.

[0058] As shown in FIG. 4, the state transition table associates the following items (3-1) to (3-4) with one another.

(3-1) Term: a highly abstract search term, set as a character string expressing the meaning of the corresponding video scene. A term with fragmentary meaning, of the kind used as an event index, can be associated with a single event in the video. A highly abstract term (search term), by contrast, cannot be mapped to a single event in the video and can be expressed only by a combination of a plurality of events. In other words, the meaning of a highly abstract term (search term) is established only once the occurrence of a plurality of events in some section has become clear.

[0059] (3-2) Search granularity: the structural unit that is appropriate as the unit to be searched when searching the video. Setting the smallest structural unit in which the meaning of the term (3-1) is established as the search granularity narrows the units to be searched and allows efficient searching.

(3-3) State transition pattern: an event list corresponding to the term (3-1), defined as an input sequence of a plurality of event indexes that occur in succession within the search granularity (3-2), based on the occurrence pattern of the events expressing the term.

(3-4) Key index: at least one of the event indexes present in the state transition pattern (3-3), designated as the key (central event) that serves as the starting point of the search.

[0060] That is, the state transition table expresses the semantic definition of video content as a set consisting of a list of event indexes within a structural logical unit (the search granularity), a character string expressing the meaning of the content, and a key index for efficient searching. Searching consists of discovering (parsing) the state transition patterns of this table in a video structured by indexes.

[0061] To explain further with reference to FIG. 4, for the character string "××××…×" expressing a meaning, the term is first expressed as an input sequence of a plurality of events, and the event list "event 1, event 2, event 3" is set. As shown in FIG. 5, this event list indicates that event 1 occurs in state A, then event 2 occurs, then event 3 occurs, and the state moves to state B.

[0062] Once the event list for a term has been created, the event among events 1 to 3 in the list that most symbolically represents the term, or that is best suited to identifying the term, is designated as the key index. Here, event 1 is designated as the key index. Next, among the structural units of the structured video, the smallest structural unit within which events 1 to 3 of the event list can occur with the meaning of the term is selected and set as the search granularity. Here it is written as granularity 1.

[0063] Similarly, for the character string "0000…0" expressing a meaning, the term is first expressed as an input sequence of a plurality of events, and the event list "event 1, event 2, event 4" is set. As shown in FIG. 5, this event list indicates that event 1 occurs in state A, then event 2 occurs, then event 4 occurs, and the state moves to state C.

[0064] Once the event list for a term has been created, the event among events 1, 2, and 4 in the list that most symbolically represents the term, or that is best suited to identifying the term, is designated as the key index. Here, event 4 is designated as the key index. Next, among the structural units of the structured video, the smallest structural unit within which events 1, 2, and 4 of the event list can occur with the meaning of the term is selected and set as the search granularity. Here it is written as granularity 2.

[0065] As shown in FIG. 4, granularity 1 and granularity 2, set as the search granularities for different terms, may be the same structural unit. It suffices to select the most suitable search granularity for each term, and the same structural unit may appear as the search granularity of many terms.
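
The two rows of the state transition table described for FIG. 4 can be written down as a small data structure. This is a minimal Python sketch; `TableEntry`, `STATE_TRANSITION_TABLE`, and the placeholder term names are hypothetical, and the validity check on the key index simply encodes the statement that the key index is chosen from the events of the pattern.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class TableEntry:
    term: str                 # (3-1) character string expressing the meaning
    granularity: str          # (3-2) search granularity
    pattern: Tuple[str, ...]  # (3-3) event list, in order of occurrence
    key_index: str            # (3-4) key (central event) used to start the search

    def __post_init__(self):
        # The key index is selected from the events of the pattern.
        if self.key_index not in self.pattern:
            raise ValueError("key index must be one of the events in the pattern")


# Placeholder term names stand in for the two character strings of FIG. 4.
STATE_TRANSITION_TABLE: Dict[str, TableEntry] = {
    entry.term: entry
    for entry in (
        TableEntry("term 1", "granularity 1", ("event 1", "event 2", "event 3"), "event 1"),
        TableEntry("term 2", "granularity 2", ("event 1", "event 2", "event 4"), "event 4"),
    )
}
```

Keying the table by term makes the lookup for a query term a single dictionary access, which is one possible design; the specification does not prescribe a storage layout.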

[0066] When the state transition table is used to search for the video content that a term means, search efficiency can be improved by, for example, narrowing the search to structural units of the search granularity corresponding to the term. Furthermore, by first parsing for the relevant event using the key index and then determining whether the event list holds within the structural unit of the specified search granularity, a fast and efficient search can be performed.
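
The narrowing steps described above (search granularity, then key index, then the two-stage determination of units 105a and 105b) might be sketched in Python as follows. All names and sample data are hypothetical. The sketch also assumes that a pattern "matches" when its events occur in the unit in the given order, with other events allowed in between; the specification's own algorithm may differ.

```python
from typing import Dict, List, NamedTuple, Sequence, Tuple


class Entry(NamedTuple):
    granularity: str
    pattern: Tuple[str, ...]
    key_index: str


class Unit(NamedTuple):
    name: str
    granularity: str
    events: List[str]  # event index labels in order of occurrence


def occurs_in_order(pattern: Sequence[str], events: Sequence[str]) -> bool:
    """True if the pattern events appear in `events` in the same order (assumed rule)."""
    remaining = iter(events)
    return all(p in remaining for p in pattern)  # `in` consumes the iterator


def search(term: str, table: Dict[str, Entry], units: List[Unit]) -> List[str]:
    entry = table[term]
    hits = []
    for unit in units:
        if unit.granularity != entry.granularity:        # narrow by search granularity
            continue
        if entry.key_index not in unit.events:           # narrow by key index
            continue
        if not set(entry.pattern) <= set(unit.events):   # first determination (105a)
            continue
        if occurs_in_order(entry.pattern, unit.events):  # second determination (105b)
            hits.append(unit.name)
    return hits


table = {"term 1": Entry("granularity 1", ("event 1", "event 2", "event 3"), "event 1")}
units = [
    Unit("u1", "granularity 1", ["event 1", "event 5", "event 2", "event 3"]),
    Unit("u2", "granularity 1", ["event 2", "event 1", "event 3"]),  # wrong order
    Unit("u3", "granularity 2", ["event 1", "event 2", "event 3"]),  # wrong granularity
    Unit("u4", "granularity 1", ["event 2", "event 3"]),             # key index missing
]
hits = search("term 1", table, units)
```

The cheap checks run first so that the ordered comparison is attempted only for the few units that survive them.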

[0067] (4) Definition examples of state transition patterns (state transition diagrams)

In the present invention, as described above, highly abstract terms (and concepts) that people use in searches are defined in advance as path regular expression patterns of state transitions (state transition patterns).

[0068] The preconditions are as follows.
* Event indexes are attached to the video, and the attachment of an event index causes a transition to a new state with that event index as the input symbol.
* An event index is defined as having no temporal extent.
* The video data between two event indexes of interest is defined as a scene.
* Scenes flow along time.

[0069] This flow of scenes, that is, the course of the state transitions, can be represented by a directed graph whose labels are event indexes. The nodes of the graph represent scenes. Each scene has various parameter values describing the situation attached as attributes. For example, when the video is a recording of a baseball game, these attributes include the score, the positions and names of the fielding players, the batter's name, and the SBO count (strike, ball, out).

[0070] Searching means finding a state transition pattern in the indexed video. The video portion found (a scene or an index) is the search result. When a matching scene contains other matching scenes, the shortest flow of scenes among them is taken as the search result.
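
The rule for nested matches can be illustrated with a short Python sketch. It represents each matching flow of scenes as a hypothetical `(start, end)` time span and, as one possible reading of the rule, keeps only the matches that do not contain another match.

```python
from typing import List, Tuple

Span = Tuple[float, float]  # (start, end) of a matching flow of scenes (assumed form)


def contains(outer: Span, inner: Span) -> bool:
    """True if `outer` encloses a different span `inner`."""
    return outer != inner and outer[0] <= inner[0] and inner[1] <= outer[1]


def shortest_matches(spans: List[Span]) -> List[Span]:
    """Drop every match that contains another match."""
    return [s for s in spans if not any(contains(s, other) for other in spans)]


matches = [(0.0, 10.0), (2.0, 5.0), (7.0, 9.0), (20.0, 30.0)]
result = shortest_matches(matches)
```

In this example the span (0.0, 10.0) is discarded because it encloses two shorter matches, while the isolated span (20.0, 30.0) is kept.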

[0071] In the path regular expressions for state transition patterns, a transition from scene s0 to scene s1 caused by event index I is written "s0-I->s1". The symbol "." denotes any scene and any event index. "-.->" is abbreviated as "→". A transition between two temporally consecutive scenes is written "==>". "*" denotes zero or more repetitions, and "+" denotes one or more repetitions.

【００７２】例えば、シーンｓ０から出発して、事象イ
ンデックス列“Ａ．＊Ｂ．＊（Ｃ．＊）＊Ｃ”によって
シーンｓ１に至るシーンの流れは、以下のパス正規表現
で表現される。 s0-A->.( →.)*-B->. (→.)*(-C->(→.)*)*-C ->s1 このパス表現を以下のように略記することにする。 s0-ABC+->s1For example, the flow of scenes that starts from scene s0 and reaches scene s1 through the event index sequence "A.*B.*(C.*)*C" is represented by the following path regular expression: s0-A->.(→.)*-B->.(→.)*(-C->(→.)*)*-C->s1. This path expression is abbreviated as: s0-ABC+->s1

【００７３】厳密には、正規表現“Ａ．＊Ｂ．＊（Ｃ．
＊）＊Ｃ”と、正規表現“ＡＢＣ＋”の表現内容は同一
ではないが、事象インデックスの場合、無視できる事象
インデックスも多く、それらを一々記述するのは煩雑で
あるため、このような略記法を適用する。図６は、この
パス正規表現に対応する有限状態オートマトンの状態遷
移図を示している。Strictly speaking, the regular expression "A.*B.*(C.*)*C" and the regular expression "ABC+" do not denote the same thing. However, many event indexes can be ignored, and writing each of them out is cumbersome, so this abbreviated notation is used. FIG. 6 shows the state transition diagram of the finite state automaton corresponding to this path regular expression.
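The relationship between the abbreviated form and the full expression can be sketched in code. The following is only an illustrative sketch, not part of the disclosed method: it assumes each event index is a single letter, and the helper name expand_abbreviated is hypothetical. It expands "ABC+" into "A.*B.*(C.*)*C" and checks event sequences with Python's standard re module.

```python
import re


def expand_abbreviated(pattern: str) -> str:
    """Expand an abbreviated event pattern such as 'ABC+' into the full
    regular expression 'A.*B.*(C.*)*C', in which ignorable events may
    appear between the events of interest.

    Assumption for this sketch: every event index is one ASCII letter.
    """
    tokens = re.findall(r"([A-Za-z])(\+?)", pattern)
    parts = []
    for symbol, plus in tokens:
        if plus:
            # 'X+' means one or more X, possibly separated by other events.
            parts.append(f"({symbol}.*)*{symbol}")
        else:
            parts.append(symbol)
    # Any number of ignorable events may occur between consecutive parts.
    return ".*".join(parts)


expanded = expand_abbreviated("ABC+")
print(expanded)  # A.*B.*(C.*)*C

# 'D' stands for an ignorable event index in these sample sequences.
print(bool(re.fullmatch(expanded, "ADBDCDC")))  # True
print(bool(re.fullmatch(expanded, "ADB")))      # False, no C occurs
```

The sketch works on event labels only; the scenes between events, which the path regular expression also describes, are left out for brevity.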

【００７４】また、図７に示すように、映像（映像シー
ン）に対して、Ａ，Ｂ，Ｃ，Ｄ等の事象インデックスが
付加されているとする。この映像に対して、図６に示す
シーンｓ０，ｓ１を検索すると、ｓ０，ｓ１は図７に示
すように求まる。Assume that, as shown in FIG. 7, event indexes such as A, B, C, and D have been added to a video (video scenes). Searching this video for the scenes s0 and s1 of FIG. 6 yields s0 and s1 as shown in FIG. 7.

【００７５】この状態遷移パターン（状態遷移図）にさ
らに、条件式を記述することにより、構造情報利用によ
る検索の効率化を図ることができる。例えば、状態遷移
パターンの後ろの〔〕内に条件式を書くこととする。 “ s0-ABC+->s1〔打席(s0) is the same as 打席(s1)〕”By adding a conditional expression to this state transition pattern (state transition diagram), the search can exploit structure information and run more efficiently. For example, the conditional expression is written in brackets after the state transition pattern: "s0-ABC+->s1 [at-bat(s0) is the same as at-bat(s1)]".

【００７６】構造情報を利用した例：ホームチームの逆
転シーンを探せ。ただし、検索結果のシーン列の最後
は、そのイニングの最後までとってくること。 Example using structure information: find the scene in which the home team comes from behind. The scene sequence returned as the search result must extend to the end of that inning.

【００７７】ここで、ｄｅｆ以下の定義は、単なる文字
列の置き換えであり、実際にはｓ０．ｘなどのシーン環
境が設定されたとき、値の評価が起こる。Here, the definitions following def are simple string substitutions; the values are actually evaluated when a scene environment such as s0.x is set.

【００７８】（５）具体的な映像検索処理アルゴリズム
の例ここで、具体的な映像検索処理アルゴリズムについて説
明する前に、映像検索を行う前の処理について確認して
おく。映像検索処理装置１００の状態遷移テーブル記憶
部１０２には、前述した状態遷移テーブル生成工程を介
して、既に状態遷移テーブルが記憶されているものとす
る。(5) Example of a specific video search processing algorithm. Before describing the algorithm itself, the processing that precedes a video search is reviewed. It is assumed that a state transition table has already been stored in the state transition table storage unit 102 of the video search processing device 100 through the state transition table generation step described above.

【００７９】図８は、実施の形態１の映像検索処理のア
ルゴリズムを示すフローチャートである。図１で示した
映像検索処理装置１００を用いて所望の映像部分を検索
する場合、利用者は、映像入力部１０１を介して検索し
たい映像を映像検索処理装置１００へ入力する。なお、
映像入力部１０１として装置の磁気ディスクが使用され
ている場合には、検索対象となる映像を指定するだけで
良い。FIG. 8 is a flowchart showing the algorithm of the video search processing according to the first embodiment. To search for a desired video portion using the video search processing device 100 shown in FIG. 1, the user inputs the video to be searched into the video search processing device 100 via the video input unit 101. When the magnetic disk of the device is used as the video input unit 101, the user only needs to specify the video to be searched.

【００８０】映像検索を行う際の処理として、先ず、利
用者が操作入力部１０３を介して、所望の映像シーンを
表現した用語を入力すると、検索の取り掛かりとして状
態遷移テーブルから該当する用語に対応したキーインデ
ックスを求める（Ｓ８０１）。As the processing for a video search, first, when the user inputs a term expressing the desired video scene via the operation input unit 103, the key index corresponding to that term is obtained from the state transition table as the starting point of the search (S801).

【００８１】次に、キーインデックスを利用して、キー
インデックスと一致する事象インデックスを検索し、結
果としてキーインデックスの集合を得る（Ｓ８０２）。
ステップＳ８０２で得られたキーインデックスの集合に
対して、状態遷移テーブルで指定されている構造の制約
条件（状態遷移パターン）から、そのキーインデックス
を含む構造インスタンス（検索粒度）を求める（Ｓ８０
３）。Next, the key index is used to search for event indexes that match it, yielding a set of key indexes (S802). For the set of key indexes obtained in step S802, the structure instances (search granularity) that contain each key index are determined from the structural constraint (state transition pattern) specified in the state transition table (S803).

【００８２】続いて、一つの構造インスタンス（検索粒
度と一致する構造単位）に対し（Ｓ８０４）、状態遷移
パターン中に含まれる事象インデックスが全て存在する
か否かを判定し（Ｓ８０５）、含まれていない構造イン
スタンスについては、処理を行わない。Subsequently, for one structure instance (a structural unit that matches the search granularity) (S804), it is determined whether all event indexes included in the state transition pattern are present (S805). Structure instances that do not contain all of them are not processed further.

【００８３】また、ある構造インスタンスに複数のキー
インデックスが含まれている場合は、キーインデックス
の集合から、該当するキーインデックスを除去し（Ｓ８
０６）。続いて、全て存在すると判定された構造単位に
対して、状態遷移が成立するか否かを判定し、換言すれ
ば、入力した用語に対応した状態遷移パターンと構造単
位中の事象インデックスの発生パターンが一致するか否
かを判定し（Ｓ８０７）、成立するならば、抽象度の高
い用語によって指定されている意味が成立したと判定す
る（Ｓ８０８）。また、このとき、必要に応じて、得ら
れた構造インスタンスに対して、成立した意味に該当す
る新たな事象インデックスを追加するか、または既存の
事象インデックスの属性としてその情報を付加する。If a structure instance contains a plurality of key indexes, the corresponding key indexes are removed from the set of key indexes (S806). Then, for each structural unit determined to contain all of the event indexes, it is determined whether the state transition holds, in other words, whether the state transition pattern corresponding to the input term matches the occurrence pattern of the event indexes in the structural unit (S807). If it holds, the meaning specified by the highly abstract term is judged to be established (S808). At this time, if necessary, a new event index corresponding to the established meaning is added to the obtained structure instance, or the information is added as an attribute of an existing event index.

【００８４】次に、得られたキーインデックスの集合の
全てのキーインデックスに対して、上記ステップＳ８０
３〜Ｓ８０８を繰り返す（Ｓ８０９）。Next, steps S803 to S808 are repeated for every key index in the obtained set of key indexes (S809).

【００８５】その後、得られた構造インスタンス（構造
単位）を返り値として返し（Ｓ８１０）、検索結果とし
て映像シーンの切り出しを行う（Ｓ８１１）。なお、検
索結果の切り出しは、デフォルトで指定された検索粒度
（構造単位）であっても良く、その他に、その切り出し
部分を含む任意の構造単位や、キーインデックスを基準
として前後にオフセットを指定した切り出し等が指定で
きるものとする。Thereafter, the obtained structure instances (structural units) are returned as the return value (S810), and the video scenes are cut out as the search result (S811). The search result may be cut out at the search granularity (structural unit) specified by default. Alternatively, an arbitrary structural unit that contains the cut-out portion, or a cut-out specified by offsets before and after the key index, can be designated.
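Steps S801 to S810 can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the disclosed implementation: the names Event, TableEntry, and search are hypothetical, a state transition pattern is reduced to an ordered list of event labels (repetition such as "+" is not modeled), and each event already carries the identifier of the structural unit that contains it.

```python
from dataclasses import dataclass


@dataclass
class Event:
    label: str     # event index, e.g. "hit"
    unit_id: int   # structural unit (e.g. at-bat) containing the event


@dataclass
class TableEntry:
    term: str
    granularity: str
    pattern: list      # ordered event labels (simplified pattern)
    key_index: str


def is_subsequence(pattern, labels):
    """True if pattern occurs in labels in order, gaps allowed."""
    it = iter(labels)
    return all(p in it for p in pattern)


def search(term, table, events):
    entry = table[term]                                    # S801
    key_hits = [e for e in events
                if e.label == entry.key_index]             # S802
    results, seen_units = [], set()
    for key in key_hits:                                   # S809 (loop)
        unit = key.unit_id                                 # S803
        if unit in seen_units:                             # S806
            continue
        seen_units.add(unit)
        labels = [e.label for e in events
                  if e.unit_id == unit]                    # S804
        if not set(entry.pattern) <= set(labels):          # S805
            continue
        if is_subsequence(entry.pattern, labels):          # S807, S808
            results.append(unit)
    return results                                         # S810


table = {"timely hit": TableEntry("timely hit", "at-bat",
                                  ["hit", "score"], "hit")}
events = [Event("hit", 1), Event("score", 1),
          Event("hit", 2),
          Event("score", 3), Event("hit", 3),
          Event("hit", 4), Event("score", 4), Event("score", 4)]
print(search("timely hit", table, events))  # [1, 4]
```

Cutting out the video scenes (S811) is omitted; in this sketch the returned unit identifiers stand in for the structural units to be cut out.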

【００８６】また、検索の応用の一つとして、検索条件
をパースして、状態遷移パターンと一致するパターンを
発見した場合、その意味を説明する解説文（説明文）を
生成する手続きを定義しておき、実行することで解説文
を自動生成することができる。すなわち、状態遷移テー
ブルに指定されている映像内容の意味定義（状態遷移パ
ターン）を利用して、映像を説明する文字列を生成する
ことが可能となる。断片的に振られているインデックス
を単に羅列しただけでは、単なる用語が並列に並んでい
るだけとなるが、状態遷移パターンを利用すれば、利用
者に分かりやすい文書を生成できる。As one application of the search, a procedure can be defined that parses the search condition and, when a pattern matching a state transition pattern is found, generates a commentary (explanatory text) explaining its meaning; executing the procedure generates the commentary automatically. That is, a character string describing the video can be generated by using the semantic definition (state transition pattern) of the video content specified in the state transition table. Merely listing the fragmentary indexes would yield only a row of isolated terms, but using the state transition patterns produces text that is easy for the user to understand.

【００８７】（６）具体的な動作例以上の構成および映像検索処理アルゴリズムを用いた具
体的な動作例について説明する。図９は、野球映像を例
とした時の映像インデックス（構造インデックスおよび
事象インデックス）の例を示す説明図である。構造イン
デックスによって、ゲーム開始から、回、イニング、表
・裏、打席、投球と言った構造単位で分割され階層化さ
れている。このような構造は、映像インデックスを定義
する際に予めプロファイルとして設定されている。(6) Specific operation example. A specific operation example using the configuration and video search processing algorithm described above is explained below. FIG. 9 is an explanatory diagram showing an example of video indexes (structure indexes and event indexes) for a baseball video. The structure indexes divide the video, from the start of the game, into hierarchical structural units such as inning number, inning, top/bottom, at-bat, and pitch. Such a structure is set in advance as a profile when the video indexes are defined.

【００８８】さらに、ヒット、アウト、ホームランなど
の事象インデックスも必要に応じて振られている。この
ように映像インデックスにより、構造化され、説明事象
が断片的に設定された映像に対し、図６に示したような
状態遷移パターンを用意する。例えば、打席粒度（打席
レベルの構造単位）において、ヒットの後に１回以上の
加点イベントがあった場合、『タイムリー』という意味
となる。また、アウトもしくはフライイベントに続い
て、加点イベントが発生した場合は、『犠打』という意
味となる。Event indexes such as hit, out, and home run are also assigned as needed. For a video that has been structured by video indexes in this way and in which descriptive events are set in fragments, state transition patterns such as the one shown in FIG. 6 are prepared. For example, at the at-bat granularity (the at-bat level structural unit), a hit followed by one or more scoring events means "timely". Likewise, an out or fly event followed by a scoring event means "sacrifice hit".

【００８９】このとき、中心事象（キーインデックス）
として定義されているイベントが検索の手掛かりとなる
インデックスである。このキーインデックスを頼りに映
像を検索し、指定された粒度において、状態遷移パター
ンが発見されれば、その粒度は検索結果の候補となる。Here, the event defined as the central event (key index) is the index that serves as the starting point of the search. The video is searched using this key index, and if the state transition pattern is found within the specified granularity, that unit becomes a candidate search result.

【００９０】さらに、事象インデックスが映像の状況を
表す環境パラメータ対して及ぼした効果についても、状
態遷移を定義することができる。例えば、野球の場合、
打席粒度に対して、加点イベントが発生していたとき、
打席開始直前の加点状況と打席終了直後の加点状況の変
化に対して、『先制』、『同点』、『逆転』などの意味
を表すことで指定できる。Furthermore, state transitions can also be defined for the effect that an event index has on the environment parameters that represent the situation in the video. For example, in baseball, when a scoring event has occurred within the at-bat granularity, the change between the score immediately before the at-bat starts and the score immediately after it ends can be designated by meanings such as "first run", "tie", and "come-from-behind".

【００９１】ここで、状態遷移パターンの例について具
体的に示す。＊タイムリーヒットｓ０＝＞打席イン＝＞ｓ０’−ヒット・加点＋→ｓ１＝
＞打席イン＝＞ｓ１’ ＊犠打ｓ０＝＞投席イン＝＞ｓ０’−アウト・加点＋→ｓ１＝
＞打席イン＝＞ｓ１’ ＊併殺打ｓ０＝＞打席イン＝＞ｓ０’−アウト・アウト→ｓ１＝
＞打席イン＝＞ｓ１’ ＊逆転ｓ０＝＞打席イン＝＞ｓ０’−加点＋ →ｓ１＝＞打席
イン＝＞ｓ１’ 〔ｓ０．ホームラン＞ｓ０．アウェイスコア＆＆ｓ１．’ホームスコア＞ｓ１．’アウェイスコア｜｜ｓ０．ホームラン＜ｓ０．アウェイスコア＆＆ｓ１．’ホームスコア＜ｓ１．’アウェイスコア〕Here, examples of state transition patterns are shown concretely.
* Timely hit: s0 => at-bat-in => s0' -hit・score+→ s1 => at-bat-in => s1'
* Sacrifice hit: s0 => at-bat-in => s0' -out・score+→ s1 => at-bat-in => s1'
* Double play: s0 => at-bat-in => s0' -out・out→ s1 => at-bat-in => s1'
* Come-from-behind: s0 => at-bat-in => s0' -score+→ s1 => at-bat-in => s1' [s0.home run > s0.away score && s1'.home score > s1'.away score || s0.home run < s0.away score && s1'.home score < s1'.away score]

【００９２】上記のような状態遷移パターンを設定して
おけば、抽象度の高い意味の検索に対応できる。逆に言
えば、この状態遷移パターンを定義するだけで、映像の
インデックス情報の構造や事象定義などに依存せずに映
像の内容の意味に基づいた検索が可能となる。すなわ
ち、半構造データとしての映像インデックス情報（イン
デックスの構造は、作成者ごとに異なり、固定していな
いという半構造データの特徴を備えた映像インデック
ス）に対して、統一した検索問い合わせのための環境を
用意することが可能となる。Setting state transition patterns such as these makes it possible to search for highly abstract meanings. Conversely, merely defining these state transition patterns enables searches based on the meaning of the video content without depending on the structure of the video's index information or on the event definitions. That is, a unified search query environment can be provided for video index information treated as semi-structured data (video indexes whose structure differs from creator to creator and is not fixed, which is the characteristic of semi-structured data).

【００９３】また、解説文（説明文）の自動生成に関し
ては、抽象度の高い意味を表す状態遷移テーブルに、そ
の主語となるデータをどこから取ってくるかという情報
（文字列定義情報）を併せて定義しておき、その主語と
意味情報、さらに状況変化により発生した意味情報を接
続詞を挟んで組み合わせることにより、利用者にとって
違和感のない解説文が生成できる。例えば、タイムリー
ヒットという用語に対して、説明文の生成条件（文字列
定義情報）は次のように指定される。For automatic generation of a commentary (explanatory text), the state transition table that represents highly abstract meanings also defines information on where to take the subject data from (character string definition information). Combining the subject, the semantic information, and the semantic information arising from the change in situation, joined by connectives, produces a commentary that reads naturally to the user. For example, for the term "timely hit", the generation condition of the explanatory text (character string definition information) is specified as follows.

【００９４】タイムリーヒットｓ０＝＞打席イン＝＞ｓ０’−ヒット・加点→ｓ１＝＞
打席イン＝＞ｓ１’ 文字列：＜回．回数＞＜イニング・表または裏＞＜打
席．打者名＞の〔加点〕点タイムリー〔で同点｜逆転｜
加点〕＃１＜回．回数＞＜イニング・表または裏＞ ‥‥‥いつ＃２＜打席．打者名＞ ‥‥‥誰が（主語）＃３の〔加点〕点タイムリー ‥‥‥何を＃４「で同点｜逆転｜加点」 ‥‥‥どのようにTimely hit: s0 => at-bat-in => s0' -hit・score→ s1 => at-bat-in => s1'
Character string: <inning.number><inning.top or bottom><at-bat.batter name>'s [score]-run timely hit [to tie | to take the lead | to add runs]
#1 <inning.number><inning.top or bottom> ... when
#2 <at-bat.batter name> ... who (subject)
#3 [score]-run timely hit ... what
#4 "to tie | to take the lead | to add runs" ... how

【００９５】ここで、＜インデックス．属性名＞は指定
された映像インデックスの属性を示す。〔インデック
ス〕は、指定された粒度においてインデックスが発生し
た回数を示す。「用語」は、中に記述されている抽象度
の高い用語が成立した場合に記述することを示す。Here, <index.attribute name> denotes an attribute of the specified video index. [index] denotes the number of times the index occurred within the specified granularity. "term" indicates text that is written when the highly abstract term inside the quotation marks is established.

【００９６】上記の指定により検索結果が得られたとき
の説明文は、１回裏 ×× の２点タイムリーで逆転といった形態となる。With the above specification, the explanatory text produced for a search result takes a form such as "Bottom of the 1st: XX's two-run timely hit takes the lead".
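How the character string definition information might drive text generation can be sketched as follows. This is a minimal sketch under assumptions: the function name generate_description, the attribute keys, and the English wording of the output are all hypothetical; only the four slots (when, who, what, how) and the three kinds of input (index attributes, occurrence counts, established terms) come from the specification above.

```python
def generate_description(attrs, counts, established_terms):
    """Assemble a description for the term 'timely hit' from:
    attrs             -- <index.attribute> values
    counts            -- [index] occurrence counts within the granularity
    established_terms -- abstract terms that hold in this structural unit
    """
    # #1 when: <inning.number><inning.top or bottom>
    when = f"{attrs['inning.half'].capitalize()} of the {attrs['inning.number']}"
    # #2 who (subject): <at-bat.batter name>
    who = attrs["at_bat.batter"]
    # #3 what: [score]-run timely hit
    what = f"{counts['score']}-run timely hit"
    # #4 how: written only when one of the listed terms is established
    how = next((t for t in ("tie", "come-from-behind", "added run")
                if t in established_terms), None)

    sentence = f"{when}: {who}'s {what}"
    if how:
        sentence += f" ({how})"
    return sentence


text = generate_description(
    attrs={"inning.number": "1st", "inning.half": "bottom",
           "at_bat.batter": "XX"},
    counts={"score": 2},
    established_terms={"come-from-behind"},
)
print(text)  # Bottom of the 1st: XX's 2-run timely hit (come-from-behind)
```

Keeping the slot order in one place makes it straightforward to swap in a different sentence layout for another language or term.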

【００９７】また、リアルタイムオーサリングの際は、
インデックスを振りながら、上記の状態遷移パターンの
キーインデックス（中心事象）が発生したときに、その
前後において、指定された粒度内で状態遷移パターンを
満足しているかの検証を行う。状態遷移パターンを満足
するような事象が連続して起きたときには、そこで定義
されている意味が発生したとし、新たなインデックスを
付加したり、キーインデックスの属性として、その情報
を加えるなどの処理を行う。In real-time authoring, while indexes are being assigned, whenever a key index (central event) of one of the above state transition patterns occurs, the events before and after it are checked to see whether the state transition pattern is satisfied within the specified granularity. When events satisfying the state transition pattern occur in succession, the meaning defined there is taken to have occurred, and processing such as adding a new index or adding the information as an attribute of the key index is performed.

【００９８】例えば、打席粒度の中で、ヒットの後に加
点インデックスが１回以上続いた場合は、そのヒットは
タイムリーであったと判定し、タイムリーヒットという
インデックス（抽象インデックス）を新たに付加する
か、ヒットインデックス（事象インデックス）の属性と
して『タイムリー』を加えるなどの処理を行う。For example, if a hit is followed by one or more scoring indexes within the at-bat granularity, the hit is judged to be timely, and either a new index called "timely hit" (an abstract index) is added or "timely" is added as an attribute of the hit index (an event index).
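The real-time case can be illustrated with a small streaming sketch. It is an assumption-laden simplification rather than the disclosed procedure: the function name author_events is hypothetical, events arrive as (at-bat id, label) pairs, and only the "timely hit" pattern (a hit followed by at least one scoring event in the same at-bat) is checked.

```python
def author_events(stream):
    """Consume (at_bat_id, label) pairs in arrival order and return the
    abstract indexes that become established while indexing.

    An abstract index 'timely hit' is emitted once per at-bat, at the
    moment a 'score' event arrives after a 'hit' in that same at-bat.
    """
    abstract_indexes = []
    current_id, labels, tagged = None, [], False
    for at_bat_id, label in stream:
        if at_bat_id != current_id:
            # A new at-bat starts: reset the per-unit state.
            current_id, labels, tagged = at_bat_id, [], False
        labels.append(label)
        if not tagged and label == "score" and "hit" in labels[:-1]:
            abstract_indexes.append((at_bat_id, "timely hit"))
            tagged = True
    return abstract_indexes


stream = [(1, "hit"), (1, "score"), (1, "score"),
          (2, "out"), (2, "score"),
          (3, "hit")]
print(author_events(stream))  # [(1, 'timely hit')]
```

Recording the result as an attribute of the hit event, the alternative mentioned in the text, would only change what is stored at the point where the pattern is detected.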

【００９９】また、タイムリーヒットというインデック
ス（抽象インデックス）をキーインデックスとして、状
態遷移テーブルに予め設定しておいても良い。Further, an index called a timely hit (abstract index) may be set as a key index in the state transition table in advance.

【０１００】以上説明した実施の形態１に係る映像検索
方法および映像インデックス付与方法は、前述した説明
および各フローチャートに示した手順に従って、予めプ
ログラムをコンピュータで実行することによって実現さ
れる。このプログラムは、ハードディスク、フロッピー
（登録商標）ディスク、ＣＤ−ＲＯＭ、ＭＯ、ＤＶＤ等
のコンピュータで読み取り可能な記録媒体に記録され、
コンピュータによって記録媒体から読み出されることに
よって実行される。また、このプログラムは、上記記録
媒体を介して、またはネットワークを介して配布するこ
とができる。The video search method and the video index assigning method according to the first embodiment described above are realized by executing a program prepared in advance on a computer, following the procedures given in the description above and in the flowcharts. The program is recorded on a computer-readable recording medium such as a hard disk, a floppy (registered trademark) disk, a CD-ROM, an MO, or a DVD, and is executed by being read from the recording medium by the computer. The program can be distributed via such a recording medium or via a network.

【０１０１】（実施の形態２）実施の形態２の映像内容
説明文生成装置は、基本的には実施の形態１の映像検索
処理装置１００の検索の応用の一つとして、検索条件を
パースして、映像中（構造単位中）に状態遷移パターン
と一致するパターンを発見した場合、その意味を説明す
る解説文（説明文）を生成する手続きを定義し、実行す
ることで映像内容の説明文生成を自動生成するものであ
る。すなわち、状態遷移テーブルに指定されている映像
内容の意味定義（状態遷移パターン）を利用して、映像
内容を説明する文字列を生成するものである。(Embodiment 2) The video content description generating apparatus of the second embodiment is basically an application of the search performed by the video search processing device 100 of the first embodiment. It parses the search condition and, when a pattern matching a state transition pattern is found in the video (in a structural unit), runs a predefined procedure that generates a commentary (explanatory text) explaining its meaning, so that a description of the video content is generated automatically. That is, a character string describing the video content is generated by using the semantic definition (state transition pattern) of the video content specified in the state transition table.

【０１０２】しかし、映像内容の説明文生成は、必ずし
も映像検索処理装置１００における映像検索処理の後処
理または追加的な機能として実行されるだけでなく、前
述した構造インデックスおよび事象インデックスが付与
され、構造インデックスで分割された区間の映像シーン
を映像の構造単位とし、かつ、複数の階層化した構造単
位を用いて構造化した映像であれば、映像検索処理とは
別に、独立した機能として映像内容の説明文生成処理を
行うことができる。したがって、本発明の映像内容の説
明文生成方法を明確に表現するために、ここでは適用す
る装置を映像内容説明文生成装置２００として呼ぶこと
にする。However, the description of the video content need not be generated only as post-processing or as an additional function of the video search processing in the video search processing device 100. For any video to which the structure indexes and event indexes described above have been assigned, in which the video scenes of the sections divided by the structure indexes serve as structural units, and which is structured using a plurality of hierarchical structural units, the description generation processing can be performed as an independent function, separately from the video search processing. Therefore, to express the video content description generating method of the present invention clearly, the apparatus to which it is applied is referred to here as the video content description generating apparatus 200.

【０１０３】また、実施の形態２においては、構造イン
デックス、事象インデックスおよび抽象インデックスで
構成される構造化されたインデックス情報の部分と、実
際の映像シーンの部分とがそれぞれ別々に管理されてお
り、映像内容の説明文を生成する際には、インデックス
情報の部分を入力して、説明文の生成を行うものとす
る。これにより、データ量が多い映像シーン部分と切り
離して処理を行えるので、装置の負荷を軽減して説明文
の生成処理を高速に行うことができる。In the second embodiment, a part of structured index information constituted by a structure index, an event index and an abstract index and a part of an actual video scene are separately managed. When a description of the video content is generated, a part of the index information is input to generate the description. As a result, the processing can be performed separately from the video scene part having a large data amount, so that the load on the apparatus can be reduced and the processing for generating the description can be performed at high speed.

【０１０４】以下、実施の形態２の映像内容説明文生成
装置２００について、（７）映像内容説明文生成装置の装置構成（８）事象インデックスと抽象インデックスとの関係（９）実施の形態２の状態遷移テーブルの構造（１０）映像内容の説明文生成処理のアルゴリズムの例（１１）構文および文字列定義情報を用いて生成した説
明文の具体例の順で説明する。Hereinafter, the video content description generating apparatus 200 of the second embodiment is described in the following order:
(7) Device configuration of the video content description generating apparatus
(8) Relationship between event indexes and abstract indexes
(9) Structure of the state transition table of the second embodiment
(10) Example of the algorithm for generating a description of video content
(11) Specific examples of descriptions generated using the syntax and the character string definition information

【０１０５】（７）映像内容説明文生成装置の装置構成図１０は、実施の形態２の映像内容説明文生成装置の構
成図を示し、同図（ａ）が映像内容説明文生成装置２０
０のハード構成の一例を示し、同図（ｂ）が映像内容説
明文生成装置２００の機能ブロック図を示している。映
像内容説明文生成装置２００のハード構成としては、少
なくともＣＰＵ（中央演算処理装置）、ディスプレー、
キーボード、磁気ディスクを有する装置であれば良く、
例えば、同図（ａ）に示すようなパーソナルコンピュー
タを利用することができる。(7) Device configuration of the video content description generating apparatus. FIG. 10 shows the configuration of the video content description generating apparatus of the second embodiment: FIG. 10(a) shows an example of the hardware configuration of the video content description generating apparatus 200, and FIG. 10(b) shows its functional block diagram. As hardware, the video content description generating apparatus 200 may be any device that has at least a CPU (central processing unit), a display, a keyboard, and a magnetic disk; for example, a personal computer such as the one shown in FIG. 10(a) can be used.

【０１０６】また、映像内容説明文生成装置２００は、
同図（ｂ）に示すように、説明文生成の対象である構造
化した映像を入力するための映像入力部２０１と、後述
する用語、生成粒度、状態遷移パターン、文字列定義情
報およびキーインデックスを状態遷移テーブルとして記
憶した状態遷移テーブル記憶部２０２と、映像内容の説
明文生成処理に必要な各種指定を入力するための操作入
力部２０３と、操作入力部２０３を介して説明文を生成
する映像シーンの範囲（構造単位）が指定された場合
に、状態遷移テーブル記憶部２０２の状態遷移テーブル
を参照し、説明文を生成する対象となる構造単位と一致
する生成粒度を検索し、該当する生成粒度を用いて構造
単位中で成立する用語を検索する用語検索部２０４と、
用語検索部２０４で検索された用語（成立した用語）の
状態遷移パターンの文字列定義情報を用いて、映像シー
ンの説明文を生成する説明文生成部２０５と、説明文生
成部２０５で生成した説明文を表示する説明文表示部２
０６と、を備えている。As shown in FIG. 10(b), the video content description generating apparatus 200 includes: a video input unit 201 for inputting the structured video for which a description is to be generated; a state transition table storage unit 202 that stores, as a state transition table, the terms, generation granularities, state transition patterns, character string definition information, and key indexes described later; an operation input unit 203 for inputting the various designations required for the description generation processing; a term search unit 204 that, when the range (structural unit) of the video scenes to be described is designated via the operation input unit 203, refers to the state transition table in the state transition table storage unit 202, finds the generation granularity that matches the structural unit to be described, and uses that generation granularity to find the terms that are established within the structural unit; a description generating unit 205 that generates a description of the video scene using the character string definition information of the state transition pattern of each term found (established) by the term search unit 204; and a description display unit 206 that displays the description generated by the description generating unit 205.

【０１０７】なお、実施の形態２の映像内容説明文生成
装置２００は、説明文の生成に使用する情報が設定され
た状態遷移テーブルと、予め構造インデックスおよび事
象インデックスに付与されている文字列または文字列に
変換可能な属性情報とを用いて、説明文を生成する。Note that the video content explanation generating apparatus 200 according to the second embodiment includes a state transition table in which information used for generating an explanation is set, and a character string or a character string previously assigned to a structure index and an event index. A description is generated using attribute information that can be converted into a character string.

【０１０８】（８）事象インデックスと抽象インデック
スとの関係次に、実施の形態２の映像内容説明文生成装置２００
が、説明文生成の対象とする構造化された映像について
説明する。実施の形態２で使用する構造化された映像に
は、映像を意味的な纏まりで分割するための構造インデ
ックスと、映像中で発生した事象の内容および場所を特
定するための事象インデックスと、特定の用語を定義し
た状態遷移パターンが成立した場合に、その用語の意味
が成立していることを表すための抽象インデックスと、
が付与されている。(8) Relationship between event indexes and abstract indexes. Next, the structured video for which the video content description generating apparatus 200 of the second embodiment generates descriptions is explained. The structured video used in the second embodiment is given structure indexes for dividing the video into semantic groups, event indexes for specifying the content and location of events that occur in the video, and abstract indexes that, when the state transition pattern defining a particular term is established, indicate that the meaning of that term holds.

【０１０９】なお、構造インデックスおよび事象インデ
ックスとの関係は、実施の形態１の「（２）構造インデ
ックスと事象インデックスとの関係」で説明した内容と
同一であるため、ここでは、事象インデックスと抽象イ
ンデックスとの関係について説明する。また、説明文を
生成する際の映像単位として、構造インデックスによっ
て定義された構造単位を指定したものが生成粒度であ
る。The relationship between structure indexes and event indexes is the same as described in "(2) Relationship between structure index and event index" of the first embodiment, so only the relationship between event indexes and abstract indexes is described here. The generation granularity is the structural unit, defined by the structure indexes, that is designated as the unit of video for which a description is generated.

【０１１０】事象インデックスは、映像中で発生した事
象の内容および場所を特定するためのインデックスであ
る。換言すれば、映像上の１つの事象と対応付けて付与
され、かつ、事象の内容（意味）を示す断片的な情報を
属性情報として有するものである。An event index is an index for specifying the content and location of an event that has occurred in a video. In other words, it is provided in association with one event on the video and has fragmentary information indicating the content (meaning) of the event as attribute information.

【０１１１】抽象インデックスは、複数の事象の組み合
わせによって表現される意味（すなわち、実施の形態１
で説明した抽象度の高い用語）を示すインデックスであ
る。また、抽象インデックスは、ある区間の複数の事象
の発生が明らかになって初めて意味が成立するものであ
り、複数の事象インデックスの発生パターンによって表
現可能な意味を有するものである。An abstract index is an index that indicates a meaning expressed by a combination of a plurality of events (that is, a highly abstract term as described in the first embodiment). The meaning of an abstract index is established only once the occurrence of a plurality of events in a certain section has become clear, and it can be expressed by the occurrence pattern of a plurality of event indexes.

【０１１２】一方、実施の形態２では、状態遷移パター
ンの定義を、事象（事象インデックス）の発生パターン
に基づいて、生成粒度の中で連続して発生する複数の事
象インデックスの入力列として抽象度の高い用語を定義
したものとする。In the second embodiment, a state transition pattern defines a highly abstract term as an input sequence of a plurality of event indexes that occur in succession within the generation granularity, based on the occurrence pattern of the events (event indexes).

【０１１３】換言すれば、抽象インデックスが、複数の
事象インデックスの発生パターンによって表現可能な意
味を有するものであり、状態遷移パターンが、複数の事
象インデックスの入力列として用語を定義したものある
ため、１つの抽象インデックスと１つの状態遷移パター
ンは、事象インデックスの入力列（または発生パター
ン）を構成要件として１対１の対応関係で存在すること
になる。したがって、抽象インデックスを用いて、その
抽象インデックスと１対１で対応する状態遷移パターン
（複数の事象インデックスの入力列）を表現することが
できるので、多数の事象インデックスで表現された状態
遷移パターンの場合に、抽象インデックスと事象インデ
ックスとを用いて表現することにより、状態遷移パター
ンをより少ない数のインデックスで表現することができ
るようになる。また、抽象インデックスと事象インデッ
クスとを用いて状態遷移パターンを表現することによ
り、より抽象度の高い用語の定義が容易になるという効
果を奏する。In other words, since the abstract index has a meaning that can be represented by the occurrence pattern of a plurality of event indexes, and the state transition pattern defines a term as an input sequence of the plurality of event indexes, One abstract index and one state transition pattern exist in a one-to-one correspondence with the input sequence (or occurrence pattern) of the event index as a constituent requirement. Therefore, a state transition pattern (input sequence of a plurality of event indexes) corresponding to the abstract index on a one-to-one basis can be expressed using the abstract index. In this case, the state transition pattern can be represented by a smaller number of indexes by expressing the state transition pattern using the abstract index and the event index. Also, by expressing the state transition pattern using the abstract index and the event index, it is possible to easily define a term having a higher degree of abstraction.

【０１１４】また、状態遷移パターンが成立した場合に
のみ、対応する抽象インデックスを映像に付与すること
が可能であり、映像中に抽象インデックスが付与されて
いる場合には、対応する状態遷移パターンが成立してい
ることを意味している。An abstract index can be assigned to a video only when the corresponding state transition pattern is established; conversely, if an abstract index has been assigned to the video, the corresponding state transition pattern is established.

【０１１５】次に、図１１（ａ）〜（ｃ）を参照して、
抽象インデックスを用いて状態遷移パターンを定義した
例について説明する。例えば、図１１（ａ）に示すよう
に、ある用語Ｗの状態遷移パターンが、事象１（事象イ
ンデックス）→事象２→事象３→事象８→事象１２で示
す事象インデックスの入力列で定義されているとする。Next, an example in which a state transition pattern is defined using an abstract index is described with reference to FIGS. 11(a) to 11(c). For example, as shown in FIG. 11(a), suppose that the state transition pattern of a term W is defined by the input sequence of event indexes event 1 → event 2 → event 3 → event 8 → event 12.

【０１１６】また、図１１（ｂ）に示すように、ある用
語Ｚの状態遷移パターンが、事象１→事象２→事象３→
事象８→事象１２→事象１３で示す事象インデックスの
入力列で定義されているとする。Suppose also that, as shown in FIG. 11(b), the state transition pattern of a term Z is defined by the input sequence of event indexes event 1 → event 2 → event 3 → event 8 → event 12 → event 13.

【０１１７】このような場合に、用語Ｗに対応する抽象
インデックスＷを用いて、用語Ｗの状態遷移パターン
（事象１→事象２→事象３→事象８→事象１２）を表現
するものと定義しておくと、用語Ｚの状態遷移パターン
は、抽象Ｗ（抽象インデックスＷ）と事象１３（事象イ
ンデックス）とを用いることにより、図１１（ｃ）に示
すように簡略化して記述することができる。In such a case, if the abstract index W corresponding to the term W is defined to represent the state transition pattern of the term W (event 1 → event 2 → event 3 → event 8 → event 12), the state transition pattern of the term Z can be written more simply, as shown in FIG. 11(c), using abstract W (abstract index W) and event 13 (event index).
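The one-to-one correspondence between an abstract index and its state transition pattern means that a pattern written with abstract indexes can be expanded back into plain event indexes. The sketch below illustrates this with the terms W and Z from FIG. 11; the dictionary layout and the function name expand are assumptions made for illustration only.

```python
# Each abstract index maps to the input sequence that defines its term.
ABSTRACT_DEFS = {
    "abstract W": ["event 1", "event 2", "event 3", "event 8", "event 12"],
}


def expand(pattern, abstract_defs):
    """Replace every abstract index in pattern by its defining sequence.

    Expansion is recursive, so an abstract index may itself be defined
    in terms of other abstract indexes.
    """
    expanded = []
    for item in pattern:
        if item in abstract_defs:
            expanded.extend(expand(abstract_defs[item], abstract_defs))
        else:
            expanded.append(item)
    return expanded


# Term Z written in the simplified form of FIG. 11(c).
pattern_z = ["abstract W", "event 13"]
print(expand(pattern_z, ABSTRACT_DEFS))
# ['event 1', 'event 2', 'event 3', 'event 8', 'event 12', 'event 13']
```

The expanded result equals the event-only definition of term Z in FIG. 11(b), which is the equivalence the text relies on.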

【０１１８】状態遷移パターンを定義する際に、事象イ
ンデックスだけでなく、抽象インデックスを使用可能と
することにより、より複雑な用語（より抽象的な概念の
用語）の状態遷移パターンの定義が容易となる。Allowing abstract indexes as well as event indexes to be used when defining state transition patterns makes it easier to define the state transition patterns of more complex terms (terms for more abstract concepts).

【０１１９】（９）実施の形態２の状態遷移テーブルの
構造図１２は、実施の形態２の状態遷移テーブルの構造例を
示す。状態遷移テーブルには、用語と、生成粒度と、状
態遷移パターンと、キーインデックスと、文字列定義情
報と、が設定されている。(9) Structure of State Transition Table of Second Embodiment FIG. 12 shows an example of the structure of a state transition table of the second embodiment. A term, a generation granularity, a state transition pattern, a key index, and character string definition information are set in the state transition table.

【０１２０】用語は、複数の事象の組み合わせによって
表現可能な用語であり、実施の形態１で説明した内容と
同一である。生成粒度には、説明文を生成する際の映像
単位として適当な構造単位を設定したものである。構造
インデックスによって定義された構造単位を指定するこ
とができる。すなわち、生成粒度は実施の形態１の検索
粒度と同じものである。A term is one that can be expressed by a combination of a plurality of events, as described in the first embodiment. The generation granularity is the structural unit that is appropriate as the unit of video for generating a description; any structural unit defined by the structure indexes can be specified. In other words, the generation granularity is the same as the search granularity of the first embodiment.

【０１２１】状態遷移パターンは、用語を表現する事象
の発生パターンに基づいて、生成粒度の中で連続して発
生する複数の事象インデックスの入力列として用語を定
義したものである。また、前述したように状態遷移パタ
ーンは、事象インデックスだけでなく、抽象インデック
スを用いて表現できるものとする。例えば、図１２にお
いて、用語Ｃの状態遷移パターンは、事象インデックス
のみで表すと、『事象１→事象４→事象６→事象７→事
象８』で示される入力列として表現できるが、用語Ｂの
抽象インデックス（抽象Ｂ）を用いて表すと、『抽象Ｂ
→事象７→事象８』で示される入力列として表現でき
る。A state transition pattern defines a term as an input sequence of a plurality of event indexes that occur in succession within the generation granularity, based on the occurrence pattern of the events that express the term. As described above, a state transition pattern can be expressed using abstract indexes as well as event indexes. For example, in FIG. 12, the state transition pattern of the term C is the input sequence "event 1 → event 4 → event 6 → event 7 → event 8" when written with event indexes only, but it can be written as the input sequence "abstract B → event 7 → event 8" by using the abstract index of the term B (abstract B).

[0122] The key index is set by selecting at least one of the event indexes present in the state transition pattern. In the second embodiment, the abstract index representing the corresponding term, in other words the abstract index representing the entire state transition pattern, is also set as one of the key indexes.

[0123] The character string definition information specifies the syntax elements used to generate a description and the input source of the character string used for each syntax element. The input source is attribute information of a structure index or an event index; for example, the content of the attribute ○○○ of event 1 (an event index) may be set as the input source of a character string. As described later, the input source may simply name the kind of attribute information, or, as in the syntax element (What) of term C, a character string (××××××) may be written directly between double quotation marks and used as the input source.

[0124] The syntax elements are based on 5W1H: when (When), where (Where), why (Why), whose (Who), what (What), and how it turned out (How). The character string definition information may set all of the 5W1H syntax elements or only some of them. Other syntax elements can also be set.
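The five fields of a table entry described above can be pictured as a small record type. The following Python sketch is only an illustration: the class name `TableEntry`, its field names, and the sample values for term C (FIG. 12 is not reproduced here, so the granularity and key index are placeholders) are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class TableEntry:
    """One row of the state transition table (field names are illustrative)."""
    term: str               # the term, also the name of its abstract index
    granularity: str        # generation granularity (a structural unit)
    pattern: list[str]      # state transition pattern, in order of occurrence
    key_indexes: list[str]  # event indexes selected from the pattern
    exp: dict[str, str] = field(default_factory=dict)  # syntax element -> input source

    def all_key_indexes(self) -> list[str]:
        # The abstract index for the term itself also serves as a key index.
        return [self.term] + [k for k in self.key_indexes if k != self.term]


# Placeholder values; only the pattern and the quoted What literal follow the text.
term_c = TableEntry(
    term="term C",
    granularity="structural unit",
    pattern=["abstract B", "event 7", "event 8"],
    key_indexes=["event 7"],
    exp={"What": '"xxxxxx"'},
)
print(term_c.all_key_indexes())
```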

[0125] Here, regarding the state transition table shown in FIG. 12 as a table in which the abstract index representing each term is defined, an example of defining abstract indexes for baseball video is described concretely.

[0126] For example, the abstract index of the term "one-base hit" (１塁打) can be written as follows.
&ABSINDEX one-base hit &RANGE pitch &PATTERN hit &KEY hit &EXP<When:inning_time, Who:batter_name, What:"one-base hit">

[0127] The above definition is written according to the following rules.
"&ABSINDEX string": &ABSINDEX is the declarator (identifier) of an abstract index; the string that follows is the name (term) and the meaning of the abstract index.
"&RANGE string": &RANGE is the declarator of the structural unit (generation granularity); the string that follows specifies the structural unit (generation granularity).

[0128] "&PATTERN string": &PATTERN is the declarator of the state transition pattern; the string that follows expresses the state transition pattern. For example, "&PATTERN hit" indicates that the state transition pattern consists of a single event (the event index for a hit).
"&KEY string": &KEY is the declarator of the key index; the string that follows specifies the key index. In addition to that string, the abstract index declared by "&ABSINDEX string" is itself automatically designated as a key index.

[0129] "&EXP <string>": &EXP is the declarator of the character string definition information; the "<string>" that follows defines the syntax elements and their input sources. For example, in <When:inning_time, Who:batter_name, What:"one-base hit">, the syntax element When takes the attribute information inning_time as its input source, the syntax element Who takes the attribute information batter_name, and the syntax element What takes the literal "one-base hit" (the characters between the quotation marks are input as they are).
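The declarator rules above are regular enough to be read mechanically. The following Python sketch shows one possible way to split a definition into its parts; the function name `parse_absindex` and the returned dictionary layout are hypothetical, the term names are rendered in English, and neither the "+" suffix nor brace conditions inside a pattern are interpreted.

```python
import re

# A declarator, then everything up to the next declarator or the end of the text.
_FIELD = re.compile(
    r"&(ABSINDEX|RANGE|PATTERN|KEY|EXP)\s*(.*?)"
    r"(?=\s*&(?:ABSINDEX|RANGE|PATTERN|KEY|EXP)\b|$)",
    re.S,
)
# One "Element:source" pair; the source is a quoted literal or an attribute name.
_EXP_ITEM = re.compile(r'(\w+)\s*:\s*("[^"]*"|[^,>\s]+)')


def parse_absindex(text):
    raw = {m.group(1): m.group(2).strip() for m in _FIELD.finditer(text)}
    exp = {}
    for element, source in _EXP_ITEM.findall(raw.get("EXP", "")):
        if source.startswith('"'):
            exp[element] = ("literal", source.strip('"'))
        else:
            exp[element] = ("attribute", source)
    split = lambda s: [p.strip() for p in re.split(r"[,，]", s) if p.strip()]
    return {
        "term": raw["ABSINDEX"],
        "range": raw["RANGE"],
        "pattern": split(raw["PATTERN"]),
        # The declared abstract index is automatically a key index as well.
        "keys": [raw["ABSINDEX"]] + split(raw["KEY"]),
        "exp": exp,
    }


definition = ('&ABSINDEX one-base hit &RANGE pitch &PATTERN hit &KEY hit '
              '&EXP<When:inning_time, Who:batter_name, What:"one-base hit">')
entry = parse_absindex(definition)
print(entry["keys"], entry["exp"]["What"])
```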

[0130] Further examples of abstract index definitions are given below.
Example: abstract index for a two-base hit (２塁打)
&ABSINDEX two-base hit &RANGE pitch &PATTERN hit, advance to second base &KEY hit &EXP<When:inning_time, Who:batter_name, What:"two-base hit">
Example: abstract index for a timely hit (タイムリーヒット)
&ABSINDEX timely hit &RANGE pitch &PATTERN one-base hit, run scored+ &KEY run scored &EXP<When:inning_time, Who:batter_name, What:"timely hit">
Example: abstract index for a timely two-base hit (タイムリーツーベース)
&ABSINDEX timely two-base hit &RANGE pitch &PATTERN two-base hit, run scored+ &KEY run scored &EXP<When:inning_time, Who:batter_name, What:"timely two-base hit">
Example: abstract index for a reversal (逆転)
&ABSINDEX reversal &RANGE pitch &PATTERN run scored+ {#offense_score < #defense_score && #result_offense_score > #defense_score} &KEY run scored &EXP<How:"reversal">
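The brace condition in the reversal definition can be read as a predicate over attribute values. The sketch below assumes, without confirmation from the text, that offense_score is the batting side's score before the run, result_offense_score is its score after the run, and defense_score is the fielding side's score; the function name `is_reversal` is hypothetical.

```python
def is_reversal(offense_score, defense_score, result_offense_score):
    """Mirror of {#offense_score < #defense_score &&
    #result_offense_score > #defense_score}."""
    trailing_before = offense_score < defense_score
    leading_after = result_offense_score > defense_score
    return trailing_before and leading_after


# Trailing 1-2 before the run and leading 3-2 after it satisfies the condition.
print(is_reversal(offense_score=1, defense_score=2, result_offense_score=3))
# Only drawing level (2-2) does not.
print(is_reversal(offense_score=0, defense_score=2, result_offense_score=2))
```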

[0131] (10) Example of an Algorithm for the Video Content Description Generation Process
Next, the algorithm of the video content description generation process is described with reference to the flowchart of FIG. 13. When a description of video content is generated using the video content description generation device 200 of the second embodiment, the video for which a description is to be generated, that is, the video of the target structural unit (index information alone is sufficient), is first input via the operation input unit 203 and the video input unit 201 (S1301).

[0132] The video of the target structural unit may be input in either of two ways. In the first, the user specifies, via the operation input unit 203, the range of the video scene for which a description is wanted, and the video (index information) is input in the structural unit corresponding to the specified range. In the second, a specific structural unit is set in advance as the range for which descriptions are generated, and video of that structural unit is cut out and input automatically. With the first method the user can select a desired video scene and have a description generated for that scene only; with the second, descriptions can be generated automatically and continuously for each specific structural unit.

[0133] Next, the term search unit 204 refers to the state transition table in the state transition table storage unit 202 and searches for a generation granularity that matches the structural unit of the video for which the description is generated (S1302). It then determines whether a matching generation granularity has been found (S1303); if so, the process proceeds to step S1304, and if not, to step S1307. The term search unit 204 moves from step S1302 to step S1303 either after searching the state transition table in order from the top until no matching generation granularity remains, or each time a matching generation granularity is found.

[0134] The term search unit 204 then determines whether an event index that matches the key index of the state transition pattern corresponding to the matching generation granularity exists in the target structural unit (S1304). If no matching event index exists, the process returns to step S1302 to search for the next matching generation granularity. If a matching event index exists, it is determined whether the input sequence of event indexes in the state transition pattern matches the occurrence pattern of the event indexes in the target structural unit (S1305).

[0135] If the occurrence patterns match in step S1305, it is determined that the term defined by the state transition pattern holds, the abstract index of that term is attached to the video (index information) input in step S1301 (S1306), and the process returns to step S1302. Attaching the abstract index of an established term to the video makes it possible, for example, to use the abstract index when a description is generated again from the same video.

[0136] Steps S1302 to S1306 are repeated until no matching generation granularity remains in the state transition table. In other words, for every matching generation granularity in the state transition table it is determined whether the corresponding term holds, and the set of terms that hold is extracted.

[0137] If it is determined in step S1303 that no matching generation granularity remains, the description generation unit 205 generates a description using the character string definition information of the established terms (S1307), displays the generated description on the description display unit 206, and ends the process. A syntax based on 5W1H is prepared in advance in the description generation unit 205, which generates a description by placing the syntax elements of the character string definition information into that syntax.
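Steps S1302 to S1306 amount to a filtered scan of the state transition table. The following Python sketch illustrates that control flow only. It is not the patented implementation: the dictionary layout, the function names, and the matching rule (the expanded pattern must occur as a contiguous run of event indexes) are assumptions, and the "+" suffix and brace conditions are not modeled.

```python
def expand(pattern, table):
    """Replace abstract indexes in a pattern by their event-index sequences."""
    by_term = {e["term"]: e for e in table}
    out = []
    for item in pattern:
        if item in by_term:
            out.extend(expand(by_term[item]["pattern"], table))
        else:
            out.append(item)
    return out


def occurs(pattern, events):
    """True if pattern appears as a contiguous run inside events."""
    n = len(pattern)
    return any(events[i:i + n] == pattern for i in range(len(events) - n + 1))


def established_terms(unit, table):
    found = []
    for entry in table:                                    # S1302: scan from the top
        if entry["range"] != unit["kind"]:                 # S1303: granularity matches?
            continue
        if not any(k in unit["events"] for k in entry["keys"]):      # S1304
            continue
        if occurs(expand(entry["pattern"], table), unit["events"]):  # S1305
            found.append(entry["term"])
            unit["abstract"].add(entry["term"])            # S1306: attach abstract index
    return found                                           # handed on to S1307


TABLE = [
    {"term": "one-base hit", "range": "pitch", "pattern": ["hit"], "keys": ["hit"]},
    {"term": "two-base hit", "range": "pitch",
     "pattern": ["hit", "advance to second base"], "keys": ["hit"]},
    {"term": "timely hit", "range": "pitch",
     "pattern": ["one-base hit", "run scored"], "keys": ["run scored"]},
]
unit = {"kind": "pitch", "events": ["hit", "run scored"], "abstract": set()}
print(established_terms(unit, TABLE))
```

The key index test in S1304 is cheap and rejects most entries before the more expensive pattern comparison in S1305 is attempted.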

[0138] The description generation algorithm above assumed that the key index is an event index. Determination of whether a term holds can be made faster, however, by adding two steps after step S1303: a step that gives priority to the abstract index as the key index and determines whether the corresponding abstract index (key index) has been attached to the target structural unit, and a step that, when the abstract index has been set, determines that the term defined by the corresponding state transition pattern holds. If the abstract index is absent, step S1304 and the subsequent steps are then executed using the event index designated as the key index.
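The speed-up described here is a shortcut placed in front of the pattern comparison. A minimal Python sketch, under the assumption that a structural unit carries a set of already attached abstract indexes; all names (`term_holds`, `slow_match`, the dictionary keys) are hypothetical:

```python
def term_holds(entry, unit, full_match):
    # Fast path: the abstract index for this term is already attached.
    if entry["term"] in unit["abstract"]:
        return True
    # Otherwise fall back to the event-index key check (S1304) ...
    if not any(key in unit["events"] for key in entry["keys"]):
        return False
    # ... followed by the full pattern comparison (S1305).
    return full_match(entry["pattern"], unit["events"])


calls = []


def slow_match(pattern, events):
    calls.append(pattern)  # record how often the expensive check runs
    return all(p in events for p in pattern)


entry = {"term": "timely hit", "keys": ["run scored"],
         "pattern": ["hit", "run scored"]}
tagged = {"events": ["hit", "run scored"], "abstract": {"timely hit"}}
untagged = {"events": ["hit", "run scored"], "abstract": set()}
print(term_holds(entry, tagged, slow_match), len(calls))
print(term_holds(entry, untagged, slow_match), len(calls))
```

For the tagged unit the expensive comparison is never called; for the untagged unit it runs once.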

[0139] (11) Specific Examples of Descriptions Generated Using the Syntax and the Character String Definition Information
Next, an example of a syntax based on 5W1H and examples of descriptions generated using that syntax and the character string definition information are given. To keep the explanation simple, the following 5W1H syntax example is used.

[0140] Case where one term holds (Example 1)
First, the simplest case of generating a description is described. For example, suppose the established term is "timely hit" and its character string definition information is &EXP<When:inning_time, Who:batter_name, What:"timely hit">. Three syntax elements, When, Who, and What, are then obtained from the character string definition information of the established term.

[0141] Suppose further that the contents (character strings) of the attribute information designated as the input sources of the syntax elements are inning_time = bottom of the 1st inning, batter_name = ××, and What:"timely hit". From the above syntax and character string definition information, the following description is generated.

    Bottom of the 1st   (none)   (none)   ××'s   timely hit   (none)
    When                Where    Why      Who    What         How

Here (none) marks a slot for which there is no information; not all of the 5W1H syntax elements need be present. In this way a description can be generated using the syntax elements obtained from the character string definition information. When syntax elements other than 5W1H are specified in the character string definition information, the syntax may be adjusted as appropriate to lengthen the description, or several syntaxes incorporating other syntax elements may be prepared in advance and selected as appropriate.
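Example 1 is essentially a slot-filling step. The Python sketch below shows one way it could be done, assuming that a quoted input source is a literal and anything else names an attribute; `SLOTS`, `fill_slots`, and the English attribute values are illustrative only.

```python
SLOTS = ["When", "Where", "Why", "Who", "What", "How"]


def fill_slots(exp, attributes):
    """Resolve each syntax element and lay the results out in 5W1H order."""
    filled = {}
    for element, source in exp.items():
        if source.startswith('"'):
            filled[element] = source.strip('"')   # literal written in the definition
        else:
            filled[element] = attributes[source]  # value read from attribute information
    # Slots with no information are shown as "(none)".
    return [filled.get(slot, "(none)") for slot in SLOTS]


exp = {"When": "inning_time", "Who": "batter_name", "What": '"timely hit"'}
attributes = {"inning_time": "bottom of the 1st", "batter_name": "XX"}
print(fill_slots(exp, attributes))
```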

[0142] Case where a plurality of terms hold (Example 2)
When a plurality of terms hold, the description generation unit 205 extracts the syntax elements defined in the character string definition information of each term, combines the extracted syntax elements, places them at the positions of the corresponding syntax elements in the 5W1H syntax, and generates a description.

[0143] For example, suppose the established terms are "timely hit" and "reversal", and their character string definition information is &EXP<When:inning_time, Who:batter_name, What:"timely hit"> and &EXP<How:"reversal">, respectively. Four syntax elements, When, Who, What, and How, are then obtained from the character string definition information of the established terms.

[0144] Suppose further that the contents (character strings) of the attribute information designated as the input sources of the syntax elements are inning_time = bottom of the 1st inning, batter_name = ××, What:"timely hit", and How:"reversal". From the above syntax and character string definition information, the following description is generated.

    Bottom of the 1st   (none)   (none)   ××'s   timely hit, giving a   reversal
    When                Where    Why      Who    What                   How
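When the established terms contribute different syntax elements, as in Example 2, combining them is a plain merge. The following Python sketch assumes each term has already been resolved to a mapping from syntax element to text; `merge_elements` is a hypothetical name, and conflicting values are rejected here rather than resolved.

```python
def merge_elements(terms):
    """Combine the syntax elements of several established terms."""
    merged = {}
    for elements in terms:
        for slot, text in elements.items():
            if slot in merged and merged[slot] != text:
                # Duplicated elements with different content need a selection rule.
                raise ValueError(f"conflicting values for {slot}")
            merged[slot] = text
    return merged


timely_hit = {"When": "bottom of the 1st", "Who": "XX", "What": "timely hit"}
reversal = {"How": "reversal"}
print(merge_elements([timely_hit, reversal]))
```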

[0145] Case where a plurality of terms hold (Example 3)
When a plurality of terms hold and some syntax elements are duplicated (for example, there are several Who elements), the state transition patterns of the terms containing the duplicated syntax elements are consulted, and the syntax element of the term whose state transition pattern refers to the event index occurring later in time is selected.

[0146] For example, suppose the established terms are "one-base hit" and "timely hit", and their character string definition information is &EXP<When:inning_time, Who:batter_name, What:"one-base hit"> and &EXP<When:inning_time, Who:batter_name, What:"timely hit">, respectively. Three syntax elements, When, Who, and What, are obtained from the character string definition information of the established terms, but all three are duplicated.

[0147] Suppose further that the contents (character strings) of the attribute information designated as the input sources of the syntax elements are as follows.
For the one-base hit: inning_time = bottom of the 1st inning, batter_name = ××, What:"one-base hit"
For the timely hit: inning_time = bottom of the 1st inning, batter_name = ××, What:"timely hit"
The content of inning_time is "bottom of the 1st inning" in both cases, so no selection is needed for it, but the contents of batter_name and What differ, so one of the two must be selected.

[0148] In such a case, the state transition patterns of the terms containing the duplicated syntax elements are consulted, and the syntax element of the term whose state transition pattern refers to the event index occurring later in time is selected. For example, suppose the state transition pattern of the one-base hit is &PATTERN hit and that of the timely hit is &PATTERN one-base hit, run scored+. The positions at which the event index (hit), the event index (one-base hit), and the event index (run scored+) are attached within the range of the video scene are compared, and the event index attached at the latest position in time is identified. Here it is assumed that the event index (run scored+) is attached at the latest position.

[0149] The term that refers to the identified event index (run scored+) is the timely hit, so the syntax elements of the timely hit are selected as those referring to the event index occurring later in time in the state transition pattern.

[0150] From the above syntax and character string definition information, the following description is generated.

    Bottom of the 1st   (none)   (none)   ××'s   timely hit   (none)
    When                Where    Why      Who    What         How
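The selection rule of Example 3 can be expressed as taking the term whose pattern refers to the latest-positioned index. In the Python sketch below the positions are invented timestamps and `pick_by_latest_event` is a hypothetical helper; the text only states that the run-scored index is the latest.

```python
def pick_by_latest_event(candidates, positions):
    """Return the term whose pattern refers to the index attached latest in time.

    candidates: term -> list of indexes referenced by its state transition pattern
    positions:  index -> position in the video scene (larger means later)
    """
    return max(candidates,
               key=lambda term: max(positions[i] for i in candidates[term]))


candidates = {
    "one-base hit": ["hit"],
    "timely hit": ["one-base hit", "run scored+"],
}
# Assumed positions (seconds into the scene); only their order matters.
positions = {"hit": 10, "one-base hit": 10, "run scored+": 25}
print(pick_by_latest_event(candidates, positions))
```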

[0151] Furthermore, by adding a condition for generating a description when a plurality of terms hold, a description can also be generated by arranging the duplicated syntax elements in order of occurrence time, as follows.

    Bottom of the 1st   ××'s   one-base hit (followed by)   ××'s   timely hit
    When                Who    What                         Who    What

[0152] Case where a plurality of terms hold (Example 4)
As another way of generating a description when a plurality of terms hold, when the 5W1H syntax elements include duplicates, the state transition patterns of the terms are compared, and the description is generated giving priority to the syntax elements of the state transition pattern defined using the larger number of event indexes.

[0153] For example, suppose the established terms are "timely hit", "timely two-base hit", and "reversal", and their character string definition information is &EXP<When:inning_time, Who:batter_name, What:"timely hit">, &EXP<When:inning_time, Who:batter_name, What:"timely two-base hit">, and &EXP<How:"reversal">, respectively. Four syntax elements, When, Who, What, and How, are obtained from the character string definition information of the established terms, but the three other than How are duplicated.

[0154] Suppose further that the contents (character strings) of the attribute information designated as the input sources of the syntax elements are as follows.
For the timely hit: inning_time = bottom of the 1st inning, batter_name = ××, What:"timely hit"
For the timely two-base hit: inning_time = bottom of the 1st inning, batter_name = ××, What:"timely two-base hit"
The content of inning_time is "bottom of the 1st inning" in both cases and batter_name is "××" in both cases, so no selection is needed for them, but the content of What differs, so one of the two must be selected.

[0155] In such a case, the state transition patterns of the terms containing the duplicated syntax elements are compared, and the syntax element of the state transition pattern defined using the larger number of event indexes is selected with priority. For example, suppose the state transition pattern of the timely hit is &PATTERN one-base hit, run scored+ and that of the timely two-base hit is &PATTERN hit, advance to second base, run scored+. The timely hit refers to two event indexes, (one-base hit) and (run scored+), while the timely two-base hit refers to three, (hit), (advance to second base), and (run scored+). Since the number of references for the timely two-base hit (3) exceeds that for the timely hit (2), the syntax element of the timely two-base hit is selected here.

[0156] From the above syntax and character string definition information, the following description is generated.

    Bottom of the 1st   (none)   (none)   ××'s   timely two-base hit, giving a   reversal
    When                Where    Why      Who    What                            How
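The rule of Example 4 reduces to comparing pattern lengths. A short Python sketch with the two patterns quoted in the text; `pick_by_pattern_size` is a hypothetical name, and ties simply go to the first term listed because the text does not say how ties are broken.

```python
def pick_by_pattern_size(patterns):
    """Return the term whose state transition pattern refers to the most indexes."""
    return max(patterns, key=lambda term: len(patterns[term]))


patterns = {
    "timely hit": ["one-base hit", "run scored+"],
    "timely two-base hit": ["hit", "advance to second base", "run scored+"],
}
counts = {term: len(p) for term, p in patterns.items()}
print(counts, pick_by_pattern_size(patterns))
```

The longer pattern describes the scene more specifically, which is presumably why its wording is preferred for the What slot.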

[0157] As yet another way of generating a description, the duplicated syntax elements may simply be listed in parallel, as shown below.

    Bottom of the 1st   between 2nd and 3rd base                       ××'s   hit, giving a   reversal
                                                   because of a gust   ××'s   error
    When                Where                      Why                 Who    What            How

[0158] That is, the description reads "Bottom of the 1st inning, between second and third base, because of a gust, a reversal through (××'s error, ××'s hit)". Depending on the situation, simply listing the elements in parallel in this way can convey the events that occurred in the video scene more accurately.

[0159] The method of generating a description of video content according to the second embodiment described above is realized by having a computer execute a program prepared in advance according to the procedures given in the description above and in the flowcharts. The program is recorded on a computer-readable recording medium such as a hard disk, floppy disk, CD-ROM, MO, or DVD, and is executed by being read from the recording medium by the computer. The program can also be distributed via such a recording medium or via a network.

[0160]

[Effects of the Invention] As described above, in the video search method of the present invention (claims 1 to 8), for each term that can be expressed by a combination of a plurality of events, a structural unit suitable as the unit to be searched when retrieving video is set in advance as the search granularity, and a state transition pattern corresponding to the term is defined in advance as an input sequence of a plurality of event indexes occurring consecutively within the search granularity, based on the occurrence pattern of the events that express the term. When a desired video scene is retrieved from the video, a term expressing the desired scene is input, the search granularity corresponding to the input term is used as the unit to be searched, and, for each structural unit that matches the search granularity, it is determined whether the state transition pattern corresponding to the input term matches the occurrence pattern of the event indexes in that structural unit; the matching structural units are output as the search result. Queries about the meaning of video content can therefore be made using highly abstract terms or concepts, and the search can be performed at high speed.

[0161] Further, for each state transition pattern corresponding to a term, at least one of the event indexes present in the pattern is designated in advance as a key index that serves as the starting point of the search. When a desired video scene is retrieved, the structural units that match the search granularity and contain an event index matching the key index are found first, and only for those structural units is it determined whether the state transition pattern matches the occurrence pattern of the event indexes in the unit. Because a fast key index search (that is, a keyword search) narrows down the candidate structural units before the state transition pattern is evaluated, video search by queries using highly abstract terms or concepts can be made even faster.

[0162] Further, when a desired video scene is extracted from the video on the basis of the structural unit output as the search result, the video scene identified by that structural unit is output. Because the video scene is output in the appropriate structural unit set in advance as the search granularity, a video scene close to the portion the user wants can be output easily.

[0163] Further, when a desired video scene is extracted from the video on the basis of the structural unit output as the search result, any higher-level or lower-level structural unit of the structured video can be specified. Combining a query in abstract terms with the specification of the structural unit to be output makes searching more convenient. This is useful, for example, when the scene the user actually wants to see lies before and/or after the structural unit found with the abstract term, or within a part (a lower-level structural unit) of it.

[0164] Further, when a desired video scene is extracted from the video based on the structural unit output as a search result, the scene is extracted by designating clipping offsets before and after the location where the key index is assigned, which makes searching more convenient. This is useful, for example, when the user also wants to check the situation before and/or after the structural unit retrieved by the abstract term.
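
Offset-based clipping amounts to simple interval arithmetic around the time at which the key index was assigned. The sketch below assumes times in seconds and clamps the result to the bounds of the video. The function name `clip_around_key_index` and its parameters are illustrative, not taken from the specification.

```python
from typing import Tuple


def clip_around_key_index(key_time: float, before: float, after: float,
                          video_length: float) -> Tuple[float, float]:
    """Return (start, end) of the clip around the key index position.

    `before` and `after` are the clipping offsets in seconds; the result is
    clamped so that it never extends outside the video.
    """
    if before < 0 or after < 0:
        raise ValueError("offsets must be non-negative")
    start = max(0.0, key_time - before)
    end = min(video_length, key_time + after)
    return start, end


# Key index at 125 s, clip from 10 s before to 20 s after it.
print(clip_around_key_index(125.0, 10.0, 20.0, 3600.0))  # (115.0, 145.0)
```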

[0165] Further, when it is determined whether the state transition pattern corresponding to the input term matches the occurrence pattern of the event indexes in a structural unit, an index representing the term is defined as an abstract index, newly added to each matching structural unit, and reused as a key index. When the video is searched again with the same abstract term, this key index (abstract index) can be used to speed up the search further. In addition, when a structural unit output as a search result is to be checked again later, the same structural unit can be retrieved reliably by using this key index (abstract index).
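
The reuse of an abstract index can be sketched as a form of caching: the first search for a term evaluates the state transition pattern and tags each matching unit, and later searches for the same term only look up the tag. The code below is a hedged illustration. The names `PATTERNS`, `Unit`, `search`, and `resolved_terms`, the baseball event names, and the use of a contiguous-run match are all assumptions made for the example.

```python
from typing import Dict, List, Set

# Hypothetical state transition table: term -> consecutive event index sequence.
PATTERNS: Dict[str, List[str]] = {
    "timely_hit": ["runner_on_base", "hit", "run_scored"],
}


def contains_run(events: List[str], pattern: List[str]) -> bool:
    """True if `pattern` occurs as a contiguous run inside `events`."""
    n = len(pattern)
    return any(events[i:i + n] == pattern for i in range(len(events) - n + 1))


class Unit:
    def __init__(self, name: str, events: List[str]) -> None:
        self.name = name
        self.events = events
        self.abstract_indexes: Set[str] = set()


def search(term: str, units: List[Unit], resolved_terms: Set[str],
           stats: Dict[str, int]) -> List[Unit]:
    if term in resolved_terms:
        # Fast path: the abstract index added earlier acts as the key index.
        return [u for u in units if term in u.abstract_indexes]
    hits = []
    for unit in units:
        stats["pattern_checks"] += 1
        if contains_run(unit.events, PATTERNS[term]):
            unit.abstract_indexes.add(term)   # attach the abstract index
            hits.append(unit)
    resolved_terms.add(term)
    return hits


units = [
    Unit("at_bat_1", ["strike", "hit"]),
    Unit("at_bat_2", ["runner_on_base", "hit", "run_scored"]),
]
resolved: Set[str] = set()
stats = {"pattern_checks": 0}

first = search("timely_hit", units, resolved, stats)
second = search("timely_hit", units, resolved, stats)
print([u.name for u in first], [u.name for u in second], stats["pattern_checks"])
```

In this sketch the second call performs no pattern evaluation at all, which is the speed-up the paragraph describes.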

[0166] Further, a plurality of items of attribute information are attached to the structure indexes and event indexes, and character string definition information for generating a description related to the term from the attribute information of the structure indexes and event indexes is attached to the state transition pattern corresponding to the term. Based on the character string definition information of the state transition pattern, a description related to the term is generated by referring to the attribute information in the structural unit output as a search result. The meaning of the video content can thus be interpreted and converted into a generally meaningful character string, producing a description of the video content that is easy for the user to understand.

[0167] Further, because the state transition pattern corresponding to a term is defined as an input sequence of event indexes and abstract indexes, using abstract indexes in addition to event indexes, state transition patterns can be defined easily and the search processing can also be made faster.

[0168] Further, according to the video search method of the present invention (claim 9), the processing performed before a video search includes a state transition table generation step. In this step, for a term that can be expressed by a combination of a plurality of events, a structural unit appropriate as the unit to be searched is set in advance as a search granularity; a state transition pattern corresponding to the term is defined, based on the occurrence pattern of the events expressing the term, as an input sequence of a plurality of event indexes occurring consecutively within the search granularity; for each such state transition pattern, at least one of the event indexes in the pattern is designated as a key index serving as the starting point of a search; and a state transition table consisting of the term, search granularity, state transition pattern, and key index is generated. The processing performed during a video search includes: a term input step of inputting a term expressing a desired video scene; a search step of referring to the state transition table, taking the search granularity corresponding to the term input in the term input step as the unit to be searched, and using the key index corresponding to the term to retrieve structural units having an event index that matches the key index; a first determination step of referring to the state transition table and determining whether all the event indexes included in the state transition pattern corresponding to the term are present in each structural unit retrieved in the search step; a second determination step of determining, for each structural unit in which all were found to be present in the first determination step, whether the state transition pattern corresponding to the input term matches the occurrence pattern of the event indexes in the structural unit; and a search result output step of cutting the video scene out of the video based on each structural unit found to match in the second determination step and outputting it as a search result. Consequently, queries about the meaning of the video content can be made using highly abstract terms or concepts. Moreover, because a fast key index search (that is, a keyword search) narrows down the candidate structural units before the determination using the state transition pattern is made, video search using queries with more abstract terms or concepts can be sped up.
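
The search step and the two determination steps can be sketched as three successive filters over the structural units. This is an illustration under stated assumptions rather than the claimed implementation: the `TableRow` and `Unit` classes, the baseball event names, and the choice to treat the state transition pattern as a contiguous run of event indexes are all hypothetical simplifications.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class TableRow:
    granularity: str      # structural unit kind to search, e.g. "at_bat"
    pattern: List[str]    # event indexes that must occur consecutively
    key_index: str        # event index used for the fast first pass


@dataclass
class Unit:
    name: str
    kind: str
    events: List[str]     # event indexes in order of occurrence


# Hypothetical state transition table keyed by term.
STATE_TRANSITION_TABLE: Dict[str, TableRow] = {
    "timely_hit": TableRow("at_bat",
                           ["runner_on_base", "hit", "run_scored"],
                           "run_scored"),
}


def occurs_consecutively(events: List[str], pattern: List[str]) -> bool:
    n = len(pattern)
    return any(events[i:i + n] == pattern for i in range(len(events) - n + 1))


def search(term: str, units: List[Unit]) -> List[Unit]:
    row = STATE_TRANSITION_TABLE[term]
    # Search step: matching granularity and presence of the key index.
    candidates = [u for u in units
                  if u.kind == row.granularity and row.key_index in u.events]
    # First determination: every event index of the pattern is present.
    candidates = [u for u in candidates if set(row.pattern) <= set(u.events)]
    # Second determination: the events occur in the order the pattern requires.
    return [u for u in candidates if occurs_consecutively(u.events, row.pattern)]


units = [
    Unit("at_bat_1", "at_bat", ["strike", "hit"]),
    Unit("at_bat_2", "at_bat", ["runner_on_base", "hit", "run_scored"]),
    Unit("at_bat_3", "at_bat", ["hit", "run_scored", "runner_on_base"]),
    Unit("inning_1", "inning", ["runner_on_base", "hit", "run_scored"]),
]
print([u.name for u in search("timely_hit", units)])  # ['at_bat_2']
```

Here `at_bat_1` is rejected by the key index search, `inning_1` by the granularity check, and `at_bat_3` passes the first determination but fails the second because its events are in the wrong order.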

[0169] Furthermore, the computer-readable recording medium of the present invention (claim 10) records a program for causing a computer to execute the video search method according to any one of claims 1 to 9. By having a computer execute this program, queries about the meaning of the video content can be made using highly abstract terms or concepts. Moreover, because a fast key index search (that is, a keyword search) narrows down the candidate structural units before the determination using the state transition pattern is made, video search using queries with more abstract terms or concepts can be sped up.

[0170] Further, the video search processing device of the present invention (claims 11 and 12) comprises: video input means for inputting a structured video to be searched; storage means storing, as a state transition table, a term for designating a desired video scene to be retrieved, a search granularity designating the structural unit to be searched when the video is searched with the term, a state transition pattern defining the term as an input sequence of a plurality of event indexes occurring consecutively within the search granularity, and a key index designating at least one of the event indexes present in the state transition pattern; operation input means for inputting or designating a query term for retrieving a desired video scene; search means for, when a query term is input or designated via the operation input means, referring to the state transition table in the storage means and using the key index corresponding to the query term to retrieve, from the video input by the video input means, structural units that match the search granularity and have an event index matching the key index; determination means for receiving the structural units retrieved by the search means and determining whether the state transition pattern corresponding to the query term matches the occurrence pattern of the event indexes in each structural unit; and search result output means for cutting the video scene out of the video based on each structural unit determined to match by the determination means and outputting it as a search result. Consequently, queries about the meaning of the video content can be made using highly abstract terms or concepts. Moreover, because a fast key index search (that is, a keyword search) narrows down the candidate structural units before the determination using the state transition pattern is made, video search using queries with more abstract terms or concepts can be sped up.

[0171] Further, according to the video index assigning method of the present invention (claim 13), a state transition table is set in advance that associates, for each term: the term itself, whose meaning is established when a plurality of events occur in succession; a state transition pattern defining the term by an input sequence of the plurality of event indexes that express it; and a search granularity designating, for the term, a structural unit of the video defined by a predetermined structure index. When video indexes are assigned to a video, the state transition table is referred to and, for each structural unit specified by a structure index, a search is made for terms whose search granularity matches the structural unit in question and whose input sequence of event indexes in the state transition pattern matches the order in which event indexes were assigned within that structural unit. If a matching term exists, the meaning of that term is judged to have occurred, and an event index or attribute information indicating that the term is established is assigned. Indexes using highly abstract terms can therefore be assigned efficiently and quickly.
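
Index assignment runs the table lookup in the opposite direction from search: given one structural unit, find every term whose granularity and pattern match it. A minimal sketch follows, assuming a table keyed by term and a contiguous-run match. The names `TABLE` and `assign_abstract_indexes` and the baseball events are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical state transition table: term -> (search granularity, pattern).
TABLE: Dict[str, Tuple[str, List[str]]] = {
    "timely_hit": ("at_bat", ["runner_on_base", "hit", "run_scored"]),
    "strikeout": ("at_bat", ["strike", "strike", "strike"]),
}


def occurs_consecutively(events: List[str], pattern: List[str]) -> bool:
    n = len(pattern)
    return any(events[i:i + n] == pattern for i in range(len(events) - n + 1))


def assign_abstract_indexes(kind: str, events: List[str]) -> List[str]:
    """Return the terms established in one structural unit.

    `kind` is the structural unit type given by its structure index and
    `events` is the list of event indexes in the order they were assigned.
    """
    return [term for term, (granularity, pattern) in TABLE.items()
            if kind == granularity and occurs_consecutively(events, pattern)]


print(assign_abstract_indexes("at_bat", ["ball", "strike", "strike", "strike"]))
print(assign_abstract_indexes("at_bat", ["runner_on_base", "hit", "run_scored"]))
```

Each returned term would then be attached to the unit as an additional index or attribute.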

[0172] Further, according to the video index assigning method of the present invention (claim 14), a state transition table is set in advance that associates, for each term: the term itself, whose meaning is established when a plurality of events occur in succession; a state transition pattern defining the term by an input sequence of the plurality of event indexes that express it; a search granularity designating, for the term, a structural unit of the video defined by a predetermined structure index; and a center index designating at least one of the event indexes present in the state transition pattern. When video indexes are assigned to a video, the state transition table is referred to and, when an event index matching a center index is assigned, a search is made, for each structural unit specified by a structure index, for terms whose search granularity matches the structural unit in question and whose input sequence of event indexes in the state transition pattern matches the order in which event indexes were assigned within that structural unit. If a matching term exists, the meaning of that term is judged to have occurred, and an event index or attribute information indicating that the term is established is assigned. Indexes using highly abstract terms can therefore be assigned efficiently and quickly.

[0173] Further, the computer-readable recording medium of the present invention (claim 15) records a program for causing a computer to execute the video index assigning method according to claim 13 or 14. By having a computer execute this program, indexes using highly abstract terms can be assigned efficiently and quickly.

[0174] Further, according to the method for generating a description of video content of the present invention (claims 16 to 22), the state transition table holds, for each term that can be expressed by a combination of a plurality of events: a generation granularity, in which a structural unit appropriate as the unit of video for generating a description is set; a state transition pattern defining the term, based on the occurrence pattern of the events expressing it, as an input sequence of a plurality of event indexes occurring consecutively within the generation granularity; a key index, set for each state transition pattern by selecting at least one of the event indexes present in that pattern; and character string definition information, set for each state transition pattern, specifying the syntax elements used to generate the description and the input source of the character string used for each syntax element. When a description of video content is generated, the state transition table is referred to and searched for a generation granularity matching the structural unit for which the description is to be generated. It is then determined whether an event index matching the key index of the state transition pattern corresponding to that generation granularity is present in the structural unit. If a matching event index is present, it is determined whether the corresponding state transition pattern matches the occurrence pattern of the event indexes in the structural unit. If the patterns match, the term defined by that state transition pattern is judged to be established, and a description of the video scene of the structural unit is generated using the character string definition information of the state transition pattern of the established term. The meaning of the video content can thus be interpreted and converted into a generally meaningful character string, producing a description (descriptive character string) of the video content that is easy for the user to understand. In other words, for the structural unit (video) in question, the most suitable description (character string) can be generated according to the meaning of the video content that occurred in that portion.

[0175] Further, when a state transition pattern defining a specific term is established, an index representing that term can be assigned to the structural unit as an abstract index, and in the state transition table the abstract index representing the corresponding term is set as one of the key indexes of each state transition pattern. When it is determined whether an event index matching the key index of the state transition pattern corresponding to the relevant generation granularity is present in the structural unit, the abstract index is used preferentially as the key index to determine whether the corresponding abstract index has been assigned to the structural unit. If the abstract index has been assigned, the term defined by the corresponding state transition pattern is judged to be established, and a description of the video scene of the structural unit is generated using the character string definition information of the state transition pattern of the established term. The term corresponding to the meaning of the video content can therefore be identified (that is, whether it is established can be determined) efficiently and in a short time.

[0176] Further, in the character string definition information, attribute information of a structure index or event index is set as the input source of the character string used for each syntax element. When a state transition pattern matches and its term is established, the attribute information is therefore always present in the required structure index and state transition pattern, so the character strings can be obtained reliably and the description generated.

[0177] Further, the character string definition information contains syntax elements based on the 5W1H scheme, that is, "when (When), where (Where), why (Why), whose (Who), with what (What), and how it turned out (How)". These syntax elements can therefore be used to easily generate a description that incorporates 5W1H information.
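
One way to picture the character string definition information is as a mapping from each 5W1H slot to the index and attribute that supply its text. The sketch below is an assumption-laden illustration: `STRING_DEFINITION`, `generate_description`, the attribute names, and the sample attribute values are invented for the example and do not come from the specification.

```python
from typing import Dict, Tuple

SLOT_ORDER = ("when", "where", "why", "who", "what", "how")

# Hypothetical string definition for one term:
# slot -> (name of the structure or event index, attribute name).
STRING_DEFINITION: Dict[str, Tuple[str, str]] = {
    "when": ("inning", "label"),
    "who": ("hit", "batter"),
    "what": ("hit", "hit_type"),
    "how": ("run_scored", "result"),
}


def generate_description(definition: Dict[str, Tuple[str, str]],
                         attributes: Dict[str, Dict[str, str]]) -> str:
    """Fill the 5W1H slots from index attributes and join them into a sentence."""
    parts = []
    for slot in SLOT_ORDER:
        source = definition.get(slot)
        if source is None:
            continue                      # slot not used by this term
        index_name, attribute_name = source
        value = attributes.get(index_name, {}).get(attribute_name)
        if value:
            parts.append(value)
    return " ".join(parts) + "."


# Illustrative attribute values attached to the indexes of one structural unit.
attributes = {
    "inning": {"label": "In the bottom of the 9th,"},
    "hit": {"batter": "Suzuki's", "hit_type": "double"},
    "run_scored": {"result": "drove in one run"},
}
print(generate_description(STRING_DEFINITION, attributes))
# In the bottom of the 9th, Suzuki's double drove in one run.
```

Slots that a term does not define, such as "where" and "why" here, are simply skipped.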

[0178] Further, when a plurality of state transition patterns correspond to the relevant generation granularity, it is determined for each state transition pattern whether its term is established. When a plurality of terms are established, the syntax elements of the character string definition information set in the state transition pattern of each term are combined to generate the description of the video scene of the structural unit, so a description that explains the content in greater detail can be generated.

[0179] Further, when the syntax elements of the character string definition information set in the state transition pattern of each term are combined to generate the description of the video scene of the structural unit, and the 5W1H syntax elements include duplicates, priority is given to the syntax element that refers to the event index occurring later in time. A description using newer information (attribute information) can therefore be generated.
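
The later-event rule can be sketched as a merge in which each syntax element carries the time of the event index it refers to. The representation below (a slot mapped to a value and a time) and the name `merge_prefer_later` are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Each established term contributes: slot -> (text, time of the referenced event index).
SyntaxElements = Dict[str, Tuple[str, float]]


def merge_prefer_later(element_sets: List[SyntaxElements]) -> Dict[str, str]:
    """Combine syntax elements; on a duplicate slot keep the later event's text."""
    merged: Dict[str, Tuple[str, float]] = {}
    for elements in element_sets:
        for slot, (text, time) in elements.items():
            if slot not in merged or time > merged[slot][1]:
                merged[slot] = (text, time)
    return {slot: text for slot, (text, _) in merged.items()}


hit = {"who": ("Suzuki", 100.0), "how": ("reached second base", 104.0)}
run = {"how": ("scored a run", 110.0)}
print(merge_prefer_later([hit, run]))
# {'who': 'Suzuki', 'how': 'scored a run'}
```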

[0180] Further, when the syntax elements of the character string definition information set in the state transition pattern of each term are combined to generate the description of the video scene of the structural unit, and the 5W1H syntax elements include duplicates, the state transition patterns of the terms are compared and priority is given to the syntax element of the state transition pattern defined with more event indexes. A description using newer information (attribute information) can thus be generated, and the meaning of the video content that occurred in that portion can be expressed by a more appropriate description (character string). Giving priority to the syntax element of the state transition pattern defined with more event indexes works as follows. Suppose, for example, that a second state transition pattern consists of part of a first state transition pattern. Whenever the first state transition pattern is established, the second is always established as well. However, because the syntax element of the first state transition pattern, which has more event indexes, is always selected, the final outcome in that portion of the video can be selected as the syntax element.
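
The longer-pattern rule can be sketched in the same way, with the number of event indexes in each pattern acting as the priority. In this hedged example the terms, patterns, and texts are invented, and `merge_prefer_longer_pattern` is a hypothetical helper.

```python
from typing import Dict, List, Tuple

# Each established term: (its state transition pattern, slot -> text).
Established = Tuple[List[str], Dict[str, str]]


def merge_prefer_longer_pattern(established: List[Established]) -> Dict[str, str]:
    """Combine syntax elements; on a duplicate slot keep the element whose
    state transition pattern is defined with more event indexes."""
    merged: Dict[str, Tuple[int, str]] = {}
    for pattern, elements in established:
        for slot, text in elements.items():
            if slot not in merged or len(pattern) > merged[slot][0]:
                merged[slot] = (len(pattern), text)
    return {slot: text for slot, (_, text) in merged.items()}


# The second pattern is part of the first, so both terms are established together.
timely_hit = (["runner_on_base", "hit", "run_scored"], {"how": "drove in a run"})
plain_hit = (["hit"], {"who": "Suzuki", "how": "got a hit"})

print(merge_prefer_longer_pattern([plain_hit, timely_hit]))
# {'who': 'Suzuki', 'how': 'drove in a run'}
```

The "how" slot takes the text of the three-event pattern, which reflects the final outcome of the scene, while the non-conflicting "who" slot is kept from the shorter pattern.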

[0181] Further, when the syntax elements of the character string definition information set in the state transition pattern of each term are combined to generate the description of the video scene of the structural unit, and the 5W1H syntax elements include duplicates, the duplicate syntax elements are arranged in parallel as necessary. A description that carries more information, in other words one that is easier to understand, can therefore be generated.

[0182] Further, the computer-readable recording medium according to claim 24 records a program for causing a computer to execute the method for generating a description of video content according to any one of claims 16 to 23. The meaning of the video content can therefore be interpreted and converted into a generally meaningful character string, producing a description (descriptive character string) of the video content that is easy for the user to understand.

[Brief description of the drawings]

FIG. 1 is a configuration diagram of the video search processing device according to Embodiment 1 of the present invention.

FIG. 2 is an explanatory diagram showing the relationship between structure indexes and event indexes.

FIG. 3 is an explanatory diagram showing the relationship between structure indexes and event indexes.

FIG. 4 is an explanatory diagram showing the state transition table of Embodiment 1.

FIG. 5 is an explanatory diagram showing the state transition table of Embodiment 1.

FIG. 6 is a state transition diagram of the finite state automaton corresponding to the path regular expression of Embodiment 1.

FIG. 7 is a diagram for explaining the state transitions of the finite state automaton corresponding to the path regular expression of Embodiment 1.

FIG. 8 is a flowchart showing the algorithm of the video search processing of Embodiment 1.

FIG. 9 is an explanatory diagram showing an example of video indexes (structure indexes and event indexes), taking a baseball video as an example.

FIG. 10 is a configuration diagram of the video content description generation device according to Embodiment 2.

FIG. 11 is an explanatory diagram showing an example in which a state transition pattern is defined using an abstract index in Embodiment 2.

FIG. 12 is an explanatory diagram showing the state transition table of Embodiment 2.

FIG. 13 is a flowchart showing the algorithm of the video content description generation processing of Embodiment 2.

[Explanation of symbols]

100 video search processing device
101 video input unit
102 state transition table storage unit
103 operation input unit
104 video search unit
105 determination unit
105a first determination unit
105b second determination unit
106 search result output unit
200 video content description generation device
201 video input unit
202 state transition table storage unit
203 operation input unit
204 term search unit
205 description generation unit
206 description display unit

Continued from the front page

(72) Inventor: Yukari Shirata, c/o Ricoh Co., Ltd., 1-3-6 Nakamagome, Ota-ku, Tokyo
(72) Inventor: Hiroko Mano, c/o Ricoh Co., Ltd., 1-3-6 Nakamagome, Ota-ku, Tokyo
(72) Inventor: Atsushi Iizawa, c/o Ricoh Co., Ltd., 1-3-6 Nakamagome, Ota-ku, Tokyo
F-terms (reference): 5B075 ND12 ND35 NK10 NS01 PP02 PQ02 PR06; 5C052 AA01 AB03 AC08 CC06 CC11 DD04
(54) [Title of the invention] Video search method, computer-readable recording medium recording a program for causing a computer to execute the method, video search processing device, video index assigning method, computer-readable recording medium recording a program for causing a computer to execute the method, method for generating a description of video content, and computer-readable recording medium recording a program for causing a computer to execute the method

Claims

[Claims]

1. A video search method for searching a video to which at least a structure index, which divides the video into semantic groups, and an event index, which specifies the content and location of an event occurring in the video, have been assigned, the video being structured using a plurality of hierarchical structural units, each structural unit being the video scene of a section delimited by the structure index, the method retrieving a desired video scene from the video using the structure index and the event index, wherein: in advance, for a term that can be expressed by a combination of a plurality of events, a structural unit appropriate as the unit to be searched when the video is searched is set as a search granularity, and a state transition pattern corresponding to the term is defined, based on the occurrence pattern of the events expressing the term, as an input sequence of a plurality of event indexes occurring consecutively within the search granularity; and, when a desired video scene is searched for in the video, a term expressing the desired video scene is input, the search granularity corresponding to the input term is taken as the unit to be searched, it is determined, for each structural unit matching the search granularity, whether the state transition pattern corresponding to the input term matches the occurrence pattern of the event indexes in that structural unit, and each matching structural unit is output as a search result.

2. The video search method according to claim 1, wherein, for each state transition pattern corresponding to the term, at least one of the event indexes present in that state transition pattern is further designated as a key index serving as the starting point of a search, and, when a desired video scene is searched for in the video, structural units that match the search granularity and have an event index matching the key index are retrieved, after which it is determined whether the state transition pattern matches the occurrence pattern of the event indexes in each retrieved structural unit.

3. The video search method according to claim 1 or 2, wherein, when a desired video scene is extracted from the video based on the structural unit output as the search result, the video scene specified by that structural unit is output.

4. The video search method according to claim 1, wherein, when a desired video scene is extracted from the video based on the structural unit output as the search result, any structural unit above or below it in the structured video can be designated.

5. The video search method according to claim 2, wherein, when a desired video scene is extracted from the video based on the structural unit output as the search result, the video scene is extracted by designating clipping offsets before and after the location where the key index is assigned.

6. The video search method according to any one of claims 2 to 5, wherein, when it is determined whether the state transition pattern corresponding to the input term matches the occurrence pattern of the event indexes in a structural unit, an index representing the term is defined as an abstract index, newly added to the matching structural unit, and reused as a key index.

7. The video search method according to claim 1, wherein an abstract index is set in advance, for each term that can be expressed by a combination of a plurality of events, as an index representing that term, and, when the state transition pattern is defined, the abstract index is used in addition to the event index so that the state transition pattern corresponding to the term is defined as an input sequence consisting of event indexes and abstract indexes.

8. The video search method according to any one of claims 2 to 7, wherein a plurality of items of attribute information are added to the structure index and the event index; character string definition information for generating a description sentence related to the term, using the attribute information of the structure index and the event index, is added to the state transition pattern corresponding to the term; and, based on the character string definition information of the state transition pattern, a description sentence related to the term is generated by referring to the attribute information in the structural unit output as the search result.

9. A video search method in which at least a structure index for dividing a video into semantic groups and an event index for specifying the content and location of an event that has occurred in the video are provided, a video scene of a section delimited by the structure index is treated as a structural unit of the video, a video structured using a plurality of hierarchical structural units is the search target, and a desired video scene is searched for in the video using the structure index and the event index, the method comprising:
as processing before performing a video search, a state transition table generating step of, in advance, for each term that can be expressed by a combination of a plurality of events, setting as a search granularity the structural unit appropriate as the search target unit when searching the video; defining the state transition pattern corresponding to the term as an input sequence of a plurality of event indexes that occur consecutively within the search granularity, based on the occurrence pattern of the events expressing the term; further designating, for each state transition pattern corresponding to the term, at least one of the event indexes present in that state transition pattern as a key index for starting a search; and generating a state transition table composed of the term, the state transition pattern, and the key index;
and, as processing for performing a video search, a term input step of inputting a term representing a desired video scene;
a search step of, with reference to the state transition table, taking the search granularity corresponding to the term input in the term input step as the search target unit and, using the key index corresponding to the term, searching for a structural unit having an event index that matches the key index;
a first determination step of determining, with reference to the state transition table, whether all of the event indexes included in the state transition pattern corresponding to the term are present in the structural unit found in the search step;
a second determination step of determining, for each structural unit in which all were determined to be present in the first determination step, whether the state transition pattern corresponding to the input term matches the occurrence pattern of the event indexes in the structural unit;
and a search result output step of extracting a video scene from the video based on the structural unit determined to match in the second determination step and outputting it as a search result.
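
As a non-limiting illustration, the search step and the two determination steps of claim 9 might be sketched as follows in Python. The class names, the term `timely_hit`, the granularity names `at_bat` and `inning`, and the event names are hypothetical sample values. Treating a "match" as the pattern appearing as a consecutive run in the unit's event sequence is an assumption of this sketch; the claims do not fix a particular matching algorithm.

```python
from dataclasses import dataclass, field


@dataclass
class StructuralUnit:
    level: str                                  # structural level, e.g. "at_bat" (hypothetical)
    start: float                                # section start in seconds
    end: float                                  # section end in seconds
    events: list = field(default_factory=list)  # event index names in time order


@dataclass(frozen=True)
class TableEntry:
    granularity: str   # search granularity for the term
    pattern: tuple     # state transition pattern: input sequence of event indexes
    key_index: str     # event index used to start the search


def contains_run(events, pattern):
    """True if `pattern` occurs as a consecutive run in `events` (assumed match rule)."""
    n = len(pattern)
    return any(tuple(events[i:i + n]) == tuple(pattern)
               for i in range(len(events) - n + 1))


def search(term, table, units):
    entry = table[term]
    # Search step: units at the search granularity that carry the key index.
    found = [u for u in units
             if u.level == entry.granularity and entry.key_index in u.events]
    # First determination: every event index of the pattern is present.
    found = [u for u in found if set(entry.pattern) <= set(u.events)]
    # Second determination: the occurrence pattern matches the state transition pattern.
    return [u for u in found if contains_run(u.events, entry.pattern)]


table = {"timely_hit": TableEntry("at_bat", ("hit", "run_scored"), "run_scored")}
units = [
    StructuralUnit("at_bat", 0, 60, ["pitch", "hit", "run_scored"]),
    StructuralUnit("at_bat", 60, 120, ["run_scored", "pitch", "hit"]),
    StructuralUnit("inning", 0, 120, ["pitch", "hit", "run_scored"]),
]
for unit in search("timely_hit", table, units):
    print(unit.start, unit.end)
```

The key index check and the set-inclusion check are cheap filters, so the more expensive sequence comparison runs only on units that can still match.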

10. A computer-readable recording medium on which a program for causing a computer to execute the video search method according to claim 1 is recorded.

11. A video search processing device in which at least a structure index for dividing a video into semantic groups and an event index for specifying the content and location of an event that has occurred in the video are provided, a video scene of a section delimited by the structure index is treated as a structural unit of the video, a video structured using a plurality of hierarchical structural units is the search target, and a desired video scene is searched for in the video using the structure index and the event index, the device comprising: video input means for inputting the structured video to be searched; storage means for storing, as a state transition table, a term for specifying a desired video scene to be searched, a search granularity specifying the structural unit to serve as the search target unit when searching the video by the term, a state transition pattern in which the term is defined as an input sequence of a plurality of event indexes occurring consecutively within the search granularity, and a key index designating at least one of the event indexes present in the state transition pattern; operation input means for inputting or specifying a query term for searching for the desired video scene; search means for, when the query term is input or specified via the operation input means, referring to the state transition table in the storage means and, using the key index corresponding to the query term, searching the video input by the video input means for a structural unit that matches the search granularity and has an event index matching the key index; determining means for determining whether the state transition pattern corresponding to the input query term matches the occurrence pattern of the event indexes in the structural unit found by the search means; and search result output means for extracting a video scene from the video based on the structural unit determined to match by the determining means and outputting it as a search result.

12. The video search processing device according to claim 11, wherein the determining means comprises: first determining means for determining whether all of the event indexes included in the state transition pattern corresponding to the query term are present in the structural unit found by the search means; and second determining means for determining, for each structural unit in which all were determined to be present by the first determining means, whether the state transition pattern corresponding to the query term matches the occurrence pattern of the event indexes in the structural unit.

13. A video index assigning method for assigning to a video, when structuring the video, video indexes including at least a structure index for dividing the video into semantic groups and an event index for specifying the content and location of an event that has occurred in the video, wherein: a state transition table is set in advance that associates, for each term whose meaning is established by a plurality of consecutively occurring events, the term, a state transition pattern defining the term as an input sequence of a plurality of event indexes expressing the term, and a search granularity specified by associating a structural unit of the video defined by a predetermined structure index with the term; and, when assigning the video indexes to the video, the state transition table is referred to for each structural unit specified by the structure index, to search for a term for which the target structural unit matches the search granularity and the order in which the event indexes were assigned within the target structural unit matches the input sequence of the plurality of event indexes of the state transition pattern; and, when a matching term exists, it is determined that the meaning of the matched term has been established, and an event index or attribute information indicating the establishment of that term is assigned.
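
As a non-limiting illustration, the index assignment of claim 13 might be sketched as follows in Python. The dictionary layout of a structural unit, the field name `term_indexes`, and the sample terms and events are hypothetical. Matching on a consecutive run of event indexes is an assumption of this sketch.

```python
def occurs_consecutively(events, pattern):
    """True if `pattern` appears as a consecutive run in `events` (assumed match rule)."""
    n = len(pattern)
    return n > 0 and any(list(events[i:i + n]) == list(pattern)
                         for i in range(len(events) - n + 1))


def assign_term_indexes(unit, table):
    """Record, under unit["term_indexes"], every term established in `unit`.

    `unit` is a dict with "level" (structural level) and "events" (event index
    names in assignment order); `table` maps term -> (search granularity, pattern).
    """
    established = [
        term
        for term, (granularity, pattern) in table.items()
        if unit["level"] == granularity
        and occurs_consecutively(unit["events"], pattern)
    ]
    unit["term_indexes"] = established
    return established


table = {
    "timely_hit": ("at_bat", ["hit", "run_scored"]),
    "strikeout": ("at_bat", ["strike", "strike", "strike"]),
}
unit = {"level": "at_bat", "events": ["strike", "hit", "run_scored"]}
print(assign_term_indexes(unit, table))
```

Because the established terms are stored on the unit at indexing time, a later search for the same term need only look up the stored index.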

14. A video index assigning method for assigning to a video, when structuring the video, video indexes including at least a structure index for dividing the video into semantic groups and an event index for specifying the content and location of an event that has occurred in the video, wherein: a state transition table is set in advance that associates, for each term whose meaning is established by a plurality of consecutively occurring events, the term, a state transition pattern defining the term as an input sequence of a plurality of event indexes expressing the term, a search granularity specified by associating a structural unit of the video defined by a predetermined structure index with the term, and a key index designating at least one of the event indexes present in the state transition pattern; and, when assigning the video indexes to the video, if an event index matching the key index is assigned, the state transition table is referred to for each structural unit specified by the structure index, to search for a term for which the target structural unit matches the search granularity and the sequence of event indexes assigned within the target structural unit matches the input sequence of the plurality of event indexes of the state transition pattern; and, when a matching term exists, it is determined that the meaning of the matched term has been established, and an event index or attribute information indicating the establishment of that term is assigned.

15. A computer-readable recording medium on which a program for causing a computer to execute the video index assigning method according to claim 13 or 14 is recorded.

16. A video content description generating method for generating a description that explains the content of a video, the video being provided with at least a structure index for dividing the video into semantic groups and an event index for specifying the content and location of an event that has occurred in the video, a video scene of a section delimited by the structure index being treated as a structural unit of the video, and the video being structured using a plurality of hierarchical structural units, wherein the description is generated using a state transition table in which information used for generating the description is set, together with character strings, or attribute information convertible into character strings, assigned in advance to the structure index and the event index, and wherein:
the state transition table includes, for each term that can be expressed by a combination of a plurality of events, a generation granularity in which the structural unit appropriate as the video unit for generating the description is set; a state transition pattern defining the term as an input sequence of a plurality of event indexes occurring consecutively within the generation granularity, based on the occurrence pattern of the events expressing the term; a key index set, for each state transition pattern, by selecting at least one of the event indexes present in that state transition pattern; and character string definition information in which, for each state transition pattern, the syntax elements used for generating the description and the input source of the character string used for each syntax element are set;
and, when generating the description of the video content, the state transition table is searched for a generation granularity that matches the structural unit for which the description is to be generated; it is determined whether an event index matching the key index of the state transition pattern corresponding to that generation granularity is present in the target structural unit; if a matching event index is present, it is determined whether that state transition pattern matches the occurrence pattern of the event indexes in the target structural unit; if they match, the term defined by that state transition pattern is determined to be established; and a description of the video scene of the target structural unit is generated using the character string definition information of the state transition pattern of the established term.
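
As a non-limiting illustration, the description generation of claim 16 might be sketched as follows in Python. The table layout, the `template` field used to assemble the sentence, the syntax element names `who` and `what`, and all sample values are hypothetical; the claims specify only that syntax elements and their character string input sources are set. Matching on a consecutive run of event indexes is likewise an assumption of this sketch.

```python
def find_run(names, pattern):
    """Index at which `pattern` starts as a consecutive run in `names`, else None."""
    n = len(pattern)
    for i in range(len(names) - n + 1):
        if names[i:i + n] == list(pattern):
            return i
    return None


def describe(unit, table):
    """Return one description sentence per term established in `unit`."""
    sentences = []
    names = [e["name"] for e in unit["events"]]
    for term, entry in table.items():
        if entry["granularity"] != unit["level"]:
            continue  # generation granularity does not match this unit
        if entry["key_index"] not in names:
            continue  # key index absent: skip the pattern comparison
        start = find_run(names, entry["pattern"])
        if start is None:
            continue  # occurrence pattern does not match: term not established
        matched = unit["events"][start:start + len(entry["pattern"])]
        # If an event name repeats within the pattern, the last occurrence wins.
        by_name = {e["name"]: e for e in matched}
        # Each syntax element reads its text from an attribute of an event index.
        values = {slot: by_name[src]["attrs"][attr]
                  for slot, (src, attr) in entry["strings"].items()}
        sentences.append(entry["template"].format(**values))
    return sentences


table = {
    "timely_hit": {
        "granularity": "at_bat",
        "pattern": ["hit", "run_scored"],
        "key_index": "run_scored",
        "strings": {"who": ("hit", "player"), "what": ("hit", "kind")},
        "template": "{who} hit a timely {what}.",
    }
}
unit = {
    "level": "at_bat",
    "events": [
        {"name": "hit", "attrs": {"player": "Player A", "kind": "double"}},
        {"name": "run_scored", "attrs": {"player": "Player B"}},
    ],
}
print(describe(unit, table))
```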

17. The video content description generating method according to claim 16, wherein: when a state transition pattern defining a specific term is established, an index representing that term can be assigned to the structural unit as an abstract index; in each state transition pattern, the abstract index representing the corresponding term is set as one of the key indexes; when determining whether an event index matching the key index of the state transition pattern corresponding to the generation granularity is present in the target structural unit, the abstract index is used preferentially as the key index, and it is determined whether the corresponding abstract index has been assigned to the target structural unit; and, if the abstract index has been assigned, the term defined by that state transition pattern is determined to be established, and a description of the video scene of the target structural unit is generated using the character string definition information of the state transition pattern of the established term.
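
As a non-limiting illustration, the preferential use of an abstract index in claim 17 might be sketched as follows in Python. The function name `is_established`, the dictionary layout of a structural unit, and the sample terms are hypothetical. Recording the abstract index immediately after a successful pattern match, and matching on a consecutive run of event indexes, are assumptions of this sketch.

```python
def is_established(unit, term, pattern):
    """Decide whether `term` is established in `unit`.

    `unit` is a dict with "events" (event index names in time order) and
    "abstract_indexes" (a set of terms already known to be established).
    """
    # The abstract index is checked first; if present, no pattern match is needed.
    if term in unit["abstract_indexes"]:
        return True
    names = unit["events"]
    n = len(pattern)
    matched = any(names[i:i + n] == list(pattern)
                  for i in range(len(names) - n + 1))
    if matched:
        # Assumption: the abstract index is assigned as soon as the term is established.
        unit["abstract_indexes"].add(term)
    return matched


unit = {"events": ["hit", "run_scored"], "abstract_indexes": set()}
print(is_established(unit, "timely_hit", ["hit", "run_scored"]))
print(unit["abstract_indexes"])
```

On a second call for the same term, the function returns from the abstract index lookup without comparing the event sequence again.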

18. The video content description generating method according to claim 16, wherein, in the character string definition information, attribute information of the structure index or the event index is set as the input source of the character string used for each syntax element.

19. The video content description generating method according to claim 16, wherein syntax elements based on 5W1H, namely "when (When), where (Where), why (Why), who (Who), what (What), and how (How)", are set in the character string definition information.

20. The video content description generating method according to claim 19, wherein, when there are a plurality of state transition patterns corresponding to the generation granularity, it is determined for each state transition pattern whether its term is established, and the syntax elements of the character string definition information set in the state transition pattern of each established term are combined to generate a description of the video scene of the target structural unit.

21. The video content description generating method according to claim 20, wherein, when generating a description of the video scene of the target structural unit by combining the syntax elements of the character string definition information set in the state transition pattern of each term, if the 5W1H syntax elements contain duplicates, the syntax element that refers to the event index occurring later in time is given priority.

22. When generating a description of the video scene of the target structural unit by combining the syntax elements of the character string definition information set in the state transition pattern of each term, if the 5W1H syntax elements contain duplicates, the state transition patterns of the terms are compared, and the syntax element of the state transition pattern defined using more event indexes is given priority.
The method of generating the description of the video content described in.

23. The video content description generating method according to any one of claims 19 to 22, wherein, when generating a description of the video scene of the target structural unit by combining the syntax elements of the character string definition information set in the state transition pattern of each term, if the 5W1H syntax elements contain duplicates, the duplicate syntax elements are listed side by side as needed.
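
As a non-limiting illustration, the three ways of handling duplicate 5W1H syntax elements in claims 21 to 23 might be sketched as follows in Python. The function name `merge_slots`, the policy names, the data layout (each syntax element stored as a pair of text and the time of the event index it refers to), the joining word "and", and the sample values are all hypothetical.

```python
SLOTS = ("when", "where", "who", "what", "why", "how")


def merge_slots(candidates, policy):
    """Combine the syntax elements of several established terms.

    Each candidate is a dict with "pattern_len" (number of event indexes in its
    state transition pattern) and "slots" mapping a 5W1H name to a pair
    (text, time of the event index the text refers to).
    """
    merged = {}
    for slot in SLOTS:
        having = [c for c in candidates if slot in c["slots"]]
        if not having:
            continue
        if policy == "latest":
            # Prefer the element whose event index occurs later in time.
            best = max(having, key=lambda c: c["slots"][slot][1])
            merged[slot] = best["slots"][slot][0]
        elif policy == "most_events":
            # Prefer the pattern defined using more event indexes.
            best = max(having, key=lambda c: c["pattern_len"])
            merged[slot] = best["slots"][slot][0]
        elif policy == "parallel":
            # List the distinct texts side by side, in candidate order.
            texts = dict.fromkeys(c["slots"][slot][0] for c in having)
            merged[slot] = " and ".join(texts)
        else:
            raise ValueError(f"unknown policy: {policy}")
    return merged


a = {"pattern_len": 2, "slots": {"who": ("Player A", 10.0), "what": ("a hit", 10.0)}}
b = {"pattern_len": 3, "slots": {"who": ("Player B", 5.0)}}
for policy in ("latest", "most_events", "parallel"):
    print(policy, merge_slots([a, b], policy))
```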

24. A computer-readable recording medium having recorded thereon a program for causing a computer to execute the method for generating a description of a video content according to any one of claims 16 to 23.