JP2004343781A

JP2004343781A - Video content caption generating method, video content caption generating unit, digest video programming method, digest video programming unit, and computer-readable recording medium on which program for making computer perform method is stored

Info

Publication number: JP2004343781A
Application number: JP2004162504A
Authority: JP
Inventors: Yukari Yoshiura; 由香利吉浦; Takako Hashimoto; 隆子橋本; Atsushi Iizawa; 篤志飯沢
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 2000-01-21
Filing date: 2004-05-31
Publication date: 2004-12-02
Anticipated expiration: 2020-05-08
Also published as: JP4137007B2

Abstract

PROBLEM TO BE SOLVED: To generate captions that read smoothly and naturally for a viewer (user) by making clear the connection and relation between the captions generated from successive video scenes.

SOLUTION: A video content caption generating unit 100 is provided with: a caption generating section 101, which receives two or more items of textual information consisting of fragmentary strings that describe the content of each retrieved video scene and uses the textual information to generate a caption describing the video content of the scene; a video content determination section 102, which determines the content of each video scene from the input textual information; and a connection expression selecting section 103, which, based on the determination result of the video content determination section 102, selects a connection expression from "SO", "BUT", "AND", "FURTHERMORE", and "OR" according to the relation between the preceding and following video scenes. The caption generating section 101 uses the connection expression selected by the connection expression selecting section 103 to connect the captions of the corresponding preceding and following video scenes and outputs them.

COPYRIGHT: (C)2005,JPO&NCIPI

Description

The present invention relates to a video content description generation method, a video content description generation device, and a recording medium for generating a description that explains the video content of each extracted video scene when, with the digitization of broadcasting, supplementary information about a video (video information) is added as an index and a digest version of the video is created using that index. It also relates to a digest video programming method, a digest video programming device, and a recording medium for creating a digest video program using the generated descriptions.

In recent years, the digitization of broadcasting has progressed rapidly on a worldwide scale, and preparations for BS (Broadcast Satellite) digital broadcasting and terrestrial digital broadcasting are steadily advancing. As a result, television viewing is changing rapidly: in addition to conventional real-time viewing, storage-based viewing and non-linear viewing are becoming possible.
Here, the digest creation system for non-linear viewing that the present applicants have proposed so far is described. The applicants first devised a digest creation system that, for video to which supplementary information has been added as an index, uses the index to retrieve video scenes assumed to be important and creates a digest version of the video (a digest video). In that system, description generation was designed on the premise that, because the video scenes judged to be important also include audio commentary, it is sufficient to generate a summary of the fragmentary index as the description. The applicants have also proposed a digest creation device that, when creating a digest video using the index, reflects the preferences of the viewer (user) of the video.
The details of the above technology are disclosed in Non-Patent Documents 1 to 3.

Takako Hashimoto et al., "Study of Digest Viewing Method Using Program Index", Proceedings of the Broadcast Systems Study Group, Institute of Image Information and Television Engineers, March 1999, pp. 7-12.
Takako Hashimoto et al., "Prototype of Digest Creation Method Using Program Index", Data Engineering Workshop (DEWS'99) Proceedings CD-ROM, March 1999.
Takako Hashimoto et al., "Trial Production of Digest Creation Method for TV Receiving Terminal", ADBS99 Proceedings, December 1999.

However, the description generation processing described above has the following problems.
First, because a description is generated independently for each retrieved video scene using only that scene's fragmentary index, the connection and relation between consecutive descriptions are unclear, and it has not been possible to generate descriptions whose text flows smoothly and naturally for the viewer (user).

Second, because the description is generated using only the fragmentary index of each retrieved video scene, it has not been possible to generate a preamble or a postscript, that is, a summarizing sentence that makes clear what each retrieved video scene means in the context of the preceding and following video scenes.

Third, although the digest creation device can create a digest video that reflects the preferences of the viewer (user), the description generation processing generates descriptions only from the fragmentary index attached to the digest video (video scenes), so it has not been possible to generate descriptions that reflect the preferences of the viewer (user).

Furthermore, with the conventional technology, a digest video created by the digest creation device can be used as a simple program by playing it back as is, but it has not been possible to create a program automatically from the digest video, or to create a program with effects that reflect the preferences of the viewer (user).

The present invention has been made in view of the above. Its first object is to make clear the connection and relation between the descriptions generated from successive video scenes, and thereby to generate descriptions whose text flows smoothly and naturally for the viewer (user).

A second object of the present invention is to enable generation of a preamble and a postscript as summarizing sentences that make clear what each retrieved video scene means in the context of the preceding and following video scenes.

A third object of the present invention is to enable generation of descriptions that reflect the preferences of the viewer (user).

A fourth object of the present invention is to provide a digest video programming method or a digest video programming device that creates a program automatically from a digest video and gives the program effects that reflect the preferences of the viewer (user).

To solve the problems described above and achieve these objects, the invention according to claim 1 is a video content description generation method having a description generation step that, when plural items of text information, each consisting of a fragmentary character string (or information convertible to a character string) describing the content, are attached to each video scene retrieved from a single video stream as a scene for a digest video, generates a description of the video content of the video scene using the text information. The method includes a video content determination step of determining the content of each video scene from the text information, and a connection expression selection step of selecting, based on the determination result of the video content determination step and according to the relation between the preceding and following video scenes, a connection expression from among resultative, adversative, parallel, additive, and selective connection. In the description generation step, the descriptions of the corresponding preceding and following video scenes are connected using the connection expression selected in the connection expression selection step.
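The determination, selection, and connection steps of claim 1 can be sketched as follows. This is a minimal illustration only: the `Scene` fields and the decision rule in `select_relation` are hypothetical (this passage does not give the rule set), and the toy rule produces only three of the five relations. The English connectives are the ones listed in the abstract.

```python
from dataclasses import dataclass

# Relation names follow the claim; the English connectives follow the abstract.
CONNECTIVES = {
    "resultative": "So,",
    "adversative": "But",
    "parallel": "And",
    "additive": "Furthermore,",
    "selective": "Or",
}

@dataclass
class Scene:
    subject: str     # hypothetical field: who the scene is about
    favorable: bool  # hypothetical field: whether the outcome favored the subject
    text: str        # description already generated for this scene

def select_relation(prev: Scene, nxt: Scene) -> str:
    """Toy decision rule (an assumption, not the patent's rule set)."""
    if prev.subject != nxt.subject:
        return "parallel"
    return "additive" if prev.favorable == nxt.favorable else "adversative"

def join_descriptions(scenes: list[Scene]) -> str:
    """Connect consecutive scene descriptions with the selected connective."""
    if not scenes:
        return ""
    parts = [scenes[0].text]
    for prev, nxt in zip(scenes, scenes[1:]):
        parts.append(f"{CONNECTIVES[select_relation(prev, nxt)]} {nxt.text}")
    return " ".join(parts)

scenes = [
    Scene("Suzuki", True, "Suzuki hit a single."),
    Scene("Suzuki", False, "Suzuki was caught stealing."),
    Scene("Tanaka", True, "Tanaka hit a home run."),
]
print(join_descriptions(scenes))
```

Keeping the relation decision separate from the wording of the connective means a real rule set could replace `select_relation` without touching the joining logic.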

The invention according to claim 2 is a video content description generation method having a description generation step that, when plural items of text information, each consisting of a fragmentary character string (or information convertible to a character string) describing the content, are attached to each video scene retrieved as a scene for a digest video from a video stream structured using a hierarchical structure, generates a description of the video content of the video scene using the text information. The method includes a video content determination step of determining the content of each video scene from the text information, and a connection expression selection step of selecting, based on the determination result of the video content determination step and according to the relation between the preceding and following video scenes, a connection expression from among resultative, adversative, parallel, additive, and selective connection. In the description generation step, the descriptions of the corresponding preceding and following video scenes are connected using the connection expression selected in the connection expression selection step.

The invention according to claim 3 is the video content description generation method according to claim 2, wherein, when generating a description for a video scene at a given level of the hierarchy, the description generation step uses the hierarchical structure to generate, together with the description of that scene's video content, a preamble to the description from the text information of the video scene at the level above it.
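The use of the parent level for the preamble can be sketched as follows. This is a minimal illustration: the `SceneNode` type, the baseball-style index keys (`inning`, `score`, `player`, `event`), and the sentence templates are all hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SceneNode:
    info: dict                            # fragmentary text information (index) of this scene
    parent: Optional["SceneNode"] = None  # scene one level up in the hierarchy

def describe(node: SceneNode) -> str:
    """Return the scene description, prefixed by a preamble built from the parent scene."""
    body = f"{node.info['player']} {node.info['event']}."
    if node.parent is None:
        return body
    upper = node.parent.info
    # The preamble uses only the parent's index, never the scene's own.
    preamble = f"In the {upper['inning']}, with the score {upper['score']}:"
    return f"{preamble} {body}"

inning = SceneNode({"inning": "bottom of the 9th", "score": "2-3"})
play = SceneNode({"player": "Tanaka", "event": "hit a home run"}, parent=inning)
print(describe(play))
```

A postscript as in claim 4 could be built the same way by appending, rather than prepending, a sentence derived from the parent's text information.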

The invention according to claim 4 is the video content description generation method according to claim 2 or 3, wherein, when generating a description for a video scene at a given level of the hierarchy, the description generation step uses the hierarchical structure to generate, together with the description of that scene's video content, a postscript to the description from the text information of the video scene at the level above it.

The invention according to claim 5 is the video content description generation method according to any one of claims 1 to 4, wherein, when generating the description of a video scene, the description generation step further uses preset user preference information to vary the wording of the description according to the user's preferences.

The invention according to claim 6 is a video content description generation device having description generation means that, when plural items of text information, each consisting of a fragmentary character string (or information convertible to a character string) describing the content, are attached to each video scene retrieved from a single video stream as a scene for a digest video, generates a description of the video content of the video scene using the text information. The device comprises video content determination means for determining the content of each video scene from the text information, and connection expression selection means for selecting, based on the determination result of the video content determination means and according to the relation between the preceding and following video scenes, a connection expression from among resultative, adversative, parallel, additive, and selective connection. The description generation means connects the descriptions of the corresponding preceding and following video scenes using the connection expression selected by the connection expression selection means.

The invention according to claim 7 is a video content description generation device having description generation means that, when plural items of text information, each consisting of a fragmentary character string (or information convertible to a character string) describing the content, are attached to each video scene retrieved as a scene for a digest video from a video stream structured using a hierarchical structure, generates a description of the video content of the video scene using the text information. The device comprises video content determination means for determining the content of each video scene from the text information, and connection expression selection means for selecting, based on the determination result of the video content determination means and according to the relation between the preceding and following video scenes, a connection expression from among resultative, adversative, parallel, additive, and selective connection. The description generation means connects the descriptions of the corresponding preceding and following video scenes using the connection expression selected by the connection expression selection means.

The invention according to claim 8 is the video content description generation device according to claim 7, wherein, when generating a description for a video scene at a given level of the hierarchy, the description generation means further uses the hierarchical structure to generate, together with the description of that scene's video content, a preamble to the description from the text information of the video scene at the level above it.

The invention according to claim 9 is the video content description generation device according to claim 7 or 8, wherein, when generating a description for a video scene at a given level of the hierarchy, the description generation means further uses the hierarchical structure to generate, together with the description of that scene's video content, a postscript to the description from the text information of the video scene at the level above it.

The invention according to claim 10 is a computer-readable recording medium on which is recorded a program for causing a computer to execute the video content description generation method according to any one of claims 1 to 5.

The invention according to claim 11 is a digest video programming method that receives each video scene retrieved from a single video stream as a scene for a digest video, together with the description of the video content created for each video scene, and creates a digest video program by, in addition to playing back each video scene, presenting the description as speech or text through a preset virtual character. The method includes an input step of receiving, together with the video scenes and their descriptions, a degree value indicating how strongly the virtual character reacts emotionally to the video content of each video scene, and an effect step of performing, for each video scene, effect processing for the emotional expression of the virtual character based on the degree value.

The invention according to claim 12 is a digest video programming method that receives, together with the video scenes for a digest video, the description, preamble, postscript, and degree value of each video scene and creates a digest video program. The method includes an effect step of, in addition to playing back each video scene, presenting the description, preamble, and postscript as speech through a preset virtual character and performing, for each video scene, effect processing for the emotional expression of the virtual character based on the degree value.

The invention according to claim 13 is a digest video programming device that receives each video scene retrieved from a single video stream as a scene for a digest video, together with a previously created description, preamble, and postscript for each video scene and a degree value indicating the degree of the user's emotional change in response to its video content, and creates a digest video program. The device comprises: video file generation means for generating, as the unit of programming, a video file that associates the description, preamble, postscript, and degree value with each video scene; a program definition file database storing, as program definition files, the various configuration information of a program, including at least a virtual character; an effect definition database storing effect templates, in which plural degrees of emotional expression are set and one effect method is defined for each degree; selection means for receiving the video files, determining the degree of emotional expression for each video file based on its degree value, and selecting from the effect definition database the effect template for emotional expression corresponding to that degree; and effect processing means for receiving the program definition file, the video files, and the effect templates and performing program effect processing per video file by setting, based on the effect template selected for each video file, at least the playback timing of the video scene, the description, preamble, and postscript to be output as the virtual character's speech together with the speech output timing, and the virtual character's motion.
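The selection means of claim 13 maps a degree value to a degree of emotional expression and then looks up the one effect template defined for that degree. A minimal sketch follows, assuming degree values in the range 0 to 1; the thresholds, level names, and template names are all hypothetical.

```python
import bisect

LEVEL_BOUNDS = [0.3, 0.7]           # hypothetical thresholds on the degree value
LEVELS = ["low", "medium", "high"]  # degrees of emotional expression

# One effect method per degree, standing in for the effect definition database.
EFFECT_TEMPLATES = {
    "low": "nod",
    "medium": "clap",
    "high": "jump_and_shout",
}

def expression_level(degree: float) -> str:
    """Map a degree value to a degree of emotional expression."""
    return LEVELS[bisect.bisect_right(LEVEL_BOUNDS, degree)]

def select_template(video_file: dict) -> str:
    """Select the effect template for one video file from its degree value."""
    return EFFECT_TEMPLATES[expression_level(video_file["degree"])]

# A video file bundles one scene with its texts and degree value.
video_file = {"scene": "scene_012", "description": "Tanaka hit a home run.", "degree": 0.9}
print(select_template(video_file))
```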

The invention according to claim 14 is the digest video programming device according to claim 13, wherein the various configuration information of the program in the program definition file comprises at least one virtual character and information such as the program's studio set, the number and positions of cameras, CG lighting, CG props, sound, the program title, and superimposed caption settings.

The invention according to claim 15 is the digest video programming device according to claim 13 or 14, wherein the degree value has emotion type information indicating the type of emotion, such as joy, anger, sorrow, or pleasure; the effect definition database stores plural effect templates classified using the emotion type information and the degree of emotional expression as key indexes; and, when selecting effect templates, the selection means determines, based on the degree value, the emotion type information and the degree of emotional expression to use as key indexes, and selects all corresponding effect templates from the effect definition database.

The invention according to claim 16 is the digest video programming device according to claim 15, wherein the degree value may further consist of plural degree values; the effect definition database stores plural effect templates classified using, as key indexes, plural items of emotion type information and the degree of emotional expression for each; and, when selecting effect templates, the selection means determines, based on the plural degree values, the plural items of emotion type information and their degrees of emotional expression to use as key indexes, and selects all corresponding effect templates from the effect definition database.

The invention according to claim 17 is the digest video programming device according to any one of claims 13 to 16, further comprising designation means for designating a desired program definition file from among plural program definition files, wherein the program definition file database stores plural program definition files, and, when performing the per-video-file program effect processing, the effect processing means receives the program definition file designated through the designation means and performs the processing based on its configuration information.

The invention according to claim 18 is the digest video programming device according to claim 17, wherein each effect template has program environment information specifying the program environments in which its defined effect method can be applied, and, when plural effect templates have been selected by the selection means, the effect processing means refers to the program environment information of each, selects one effect template that is executable in the program environment provided by the program definition file designated through the designation means, and performs the per-video-file program effect processing.

The invention according to claim 19 is the digest video programming device according to claim 18, wherein use-count limit information, which limits the number of times the defined effect method may be used in programming one digest video, can be set in an effect template, and, after selecting one of the executable effect templates, if use-count limit information is set in it, the effect processing means compares the number of times the selected template has already been used against the limit to determine whether it can be used, and, if it cannot, selects another executable effect template.
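The program environment check of claim 18 and the use-count limit of claim 19 can be sketched together as follows. This is a minimal illustration: the `EffectTemplate` fields, the feature names, and the first-match order are all hypothetical, and the claims do not say how to choose among several usable templates.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class EffectTemplate:
    name: str
    requires: frozenset              # program environment features the effect needs
    max_uses: Optional[int] = None   # use-count limit; None means unlimited

def pick_template(candidates, environment, used_counts):
    """Return the first candidate that fits the environment and is under its use limit."""
    for template in candidates:
        if not template.requires <= environment:
            continue  # not executable in the designated program environment
        used = used_counts.get(template.name, 0)
        if template.max_uses is not None and used >= template.max_uses:
            continue  # limit reached; try another executable template
        used_counts[template.name] = used + 1
        return template
    return None

candidates = [
    EffectTemplate("fireworks", frozenset({"cg_props"}), max_uses=1),
    EffectTemplate("zoom", frozenset({"camera_2"})),
    EffectTemplate("cheer", frozenset()),
]
environment = {"cg_props"}  # features provided by the designated program definition file
used_counts = {}
first = pick_template(candidates, environment, used_counts)
second = pick_template(candidates, environment, used_counts)
print(first.name, second.name)
```

On the second call, "fireworks" has reached its limit and "zoom" needs a second camera that this environment lacks, so the sketch falls through to "cheer".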

The invention according to claim 20 is the digest video programming device according to any one of claims 15 to 19, wherein the per-video-file program effect processing in the effect processing means has a sequential processing function, which selects and processes the effect template to use each time the selection means finishes selecting effect templates for one video file, and a batch processing function, which waits until the selection means has finished selecting effect templates for all video files, then selects the effect template to use for each video file and processes them. When using the batch processing function, the effect processing means refers to all effect templates selected by the selection means and determines, for each set of effect templates having the same emotion type information and degree of emotional expression, the number of times that set was selected; for any set selected more than once that contains plural different effect templates, it selects effect templates so that each template is chosen a uniform number of times.
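One way to even out template use in batch mode is a round-robin within each set. This is an assumption: the claim requires only that the selection counts be uniform, not how that is achieved. A minimal sketch with hypothetical keys and template names, where each set is assumed to be non-empty:

```python
from collections import Counter
from itertools import cycle

def assign_batch(file_keys, templates_by_key):
    """Assign a template to each video file, rotating within each (emotion, level) set.

    file_keys: list of (file_id, key) pairs, available only once all files are selected.
    templates_by_key: key -> non-empty list of template names sharing that key.
    """
    rotations = {key: cycle(names) for key, names in templates_by_key.items()}
    return {file_id: next(rotations[key]) for file_id, key in file_keys}

file_keys = [
    ("f1", ("joy", "high")),
    ("f2", ("joy", "high")),
    ("f3", ("anger", "low")),
    ("f4", ("joy", "high")),
    ("f5", ("joy", "high")),
]
templates_by_key = {
    ("joy", "high"): ["jump", "confetti"],
    ("anger", "low"): ["frown"],
}
assignment = assign_batch(file_keys, templates_by_key)
print(Counter(assignment.values()))
```

With round-robin, the counts within a set differ by at most one when the number of files is not a multiple of the number of templates.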

The invention according to claim 21 is the digest video programming device according to claim 20, wherein designation information can be set in an effect template, specifying that the template be used for the program effect processing of the video file having the highest degree value, or of the video file having the lowest degree value, among the degree values associated with the template's emotion type information and degree of emotional expression; and, when an effect template with designation information exists, the effect processing means compares the degree values of all video files for which that template was selected and uses the template only for the program effect processing of the video file having the maximum or minimum degree value.
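The relative comparison in claim 21 comes down to finding the extreme degree value among the video files that selected the flagged template. A minimal sketch follows; the function name and the `mode` flag are hypothetical, and ties resolve to the first file, a case the claim does not address.

```python
def files_allowed(selected_files, degrees, mode="max"):
    """Return the single video file that may use a template flagged 'max only' or 'min only'.

    selected_files: ids of video files for which the flagged template was selected.
    degrees: file id -> degree value.
    """
    if not selected_files:
        return set()
    pick = max if mode == "max" else min
    return {pick(selected_files, key=lambda file_id: degrees[file_id])}

degrees = {"f1": 0.4, "f2": 0.9, "f3": 0.7}
print(files_allowed(["f1", "f2", "f3"], degrees))
```

The other files that selected the flagged template would need a different template from the same set; how that fallback is chosen is not specified in this passage.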

The invention according to claim 22 is a computer-readable recording medium on which is recorded a program for causing a computer to execute the digest video programming method according to claim 11 or 12.

According to the inventions of claims 1 and 2, the content of each video scene is determined from the character information, and a connection expression is selected from among resultative ("so"), adversative ("but"), parallel ("and"), additive ("furthermore"), selective ("or"), and the like according to the relationship between the preceding and following video scenes. The selected connection expression is then used to join the descriptions of those two scenes into a description of the video content. This makes the connections and relationships between the descriptions generated from the individual video scenes clear, so that a description with a smooth flow of text that feels natural to the viewer (user) can be generated.

According to the invention of claim 3, when a description is generated for a video scene at a given level among the video scenes obtained as search results from a hierarchically structured video stream, the hierarchy is used to generate, along with the description of that scene's video content, a preamble sentence for the description from the character information of the video scene one level above. A preamble sentence can therefore be generated as an overview that makes clear what each retrieved video scene means in relation to the scenes before and after it.

According to the invention of claim 4, when a description is generated for a video scene at a given level among the video scenes obtained as search results from a hierarchically structured video stream, the hierarchy is used to generate, along with the description of that scene's video content, a postscript sentence for the description from the character information of the video scene one level above. A postscript sentence can therefore be generated as an overview that makes clear what each retrieved video scene means in relation to the scenes before and after it.

According to the invention of claim 5, when the description of a video scene is generated, preset user preference information is used to vary the wording of the description according to the user's preferences, so a description that reflects the viewer's (user's) preferences can be generated.

According to the inventions of claims 6 and 7, the apparatus comprises video content determining means for determining the content of each video scene from the character information, and connection expression selecting means for selecting, based on the determination result of the video content determining means, a connection expression from among resultative ("so"), adversative ("but"), parallel ("and"), additive ("furthermore"), and selective ("or") according to the relationship between the preceding and following video scenes. The description generating means joins the descriptions of those two scenes using the connection expression selected by the connection expression selecting means. This makes the connections and relationships between the descriptions generated from the individual video scenes clear, so that a description with a smooth flow of text that feels natural to the viewer (user) can be generated.

According to the invention of claim 8, when a description is generated for a video scene at a given level among the video scenes obtained as search results from a hierarchically structured video stream, the hierarchy is used to generate, along with the description of that scene's video content, a preamble sentence for the description from the character information of the video scene one level above. A preamble sentence can therefore be generated as an overview that makes clear what each retrieved video scene means in relation to the scenes before and after it.

According to the invention of claim 9, when a description is generated for a video scene at a given level among the video scenes obtained as search results from a hierarchically structured video stream, the hierarchy is used to generate, along with the description of that scene's video content, a postscript sentence for the description from the character information of the video scene one level above. A postscript sentence can therefore be generated as an overview that makes clear what each retrieved video scene means in relation to the scenes before and after it.

According to the invention of claim 10, a program for causing a computer to execute the video content description generating method according to any one of claims 1 to 5 is recorded on the medium, and reading and executing this program on a computer provides the same effects as the video content description generating method according to any one of claims 1 to 5.

According to the invention of claim 11, a degree value of the virtual character's emotional reaction to the video content of each video scene is input together with the video scenes and the descriptions of their content, and the virtual character's emotional expression is directed for each video scene based on the degree value. It is therefore possible to provide a digest video programming method that automatically creates a program from a digest video and gives the program effects that reflect the viewer's (user's) preferences.

According to the invention of claim 12, in addition to playing back each video scene, the description, preamble sentence, and postscript sentence are delivered as speech through a preset virtual character, and the virtual character's emotional expression is directed for each video scene based on the degree value. It is therefore possible to provide a digest video programming method that automatically creates a program from a digest video and gives the program effects that reflect the viewer's (user's) preferences.

According to the invention of claim 13, the apparatus comprises the following: video file generating means for generating, as the processing unit of programming, a video file that associates a description, preamble sentence, postscript sentence, and degree value with each video scene; a program definition file database that stores, as program definition files, the various configuration information of a program including at least a virtual character; an effect definition database in which a plurality of degrees of emotional expression are set and which stores effect templates, each defining one effect method, for each degree of emotional expression; selecting means for receiving the video files, determining the degree of emotional expression for each video file based on its degree value, and selecting from the effect definition database the effect templates matching that degree of emotional expression; and effect processing means for receiving the program definition file, the video files, and the effect templates and performing program effect processing per video file by setting, based on the effect template selected for each video file, at least the playback timing of the video scene, the description, preamble sentence, and postscript sentence to be output as the virtual character's speech, the output timing of that speech, and the virtual character's motion. It is therefore possible to provide digest video programming that automatically creates a program from a digest video and gives the program effects that reflect the viewer's (user's) preferences.

According to the invention of claim 14, the various configuration information of the program in the program definition file consists of at least one virtual character and information such as the program's studio set, the number and positions of cameras, CG lighting, CG props, sound, program title, and superimposed caption settings. A wide variety of programs can therefore be built by setting or changing this configuration information.

According to the invention of claim 15, the degree value has emotion type information indicating the type of emotion, such as joy, anger, sorrow, or pleasure, and the effect definition database stores a plurality of effect templates classified with the emotion type information and the degree of emotional expression as key indexes. When selecting effect templates, the selecting means determines, based on the degree value, the emotion type information and degree of emotional expression to use as key indexes and selects all matching effect templates from the effect definition database. Fine-grained emotional expression such as joy, anger, sorrow, and pleasure can therefore be reflected in the effects.

According to the invention of claim 16, the degree value can furthermore consist of a plurality of degree values, and the effect definition database stores a plurality of effect templates classified with a plurality of pieces of emotion type information and their degrees of emotional expression as key indexes. When selecting effect templates, the selecting means determines, based on the plurality of degree values, the pieces of emotion type information and their degrees of emotional expression to use as key indexes and selects all matching effect templates from the effect definition database. Even finer compound emotional expression that combines several types of emotion, such as "frustrated and disappointed", can therefore be reflected in the effects.

According to the invention of claim 17, the apparatus further comprises designating means for designating a desired program definition file from among a plurality of program definition files, and the program definition file database stores a plurality of program definition files. When performing program effect processing per video file, the effect processing means receives the program definition file designated through the designating means and performs the processing based on the corresponding configuration information. The program settings for a digest video can therefore be made or changed easily.

According to the invention of claim 18, program environment information indicating where the defined effect method can be applied is set in each effect template. When the selecting means has selected a plurality of effect templates, the effect processing means refers to the program environment information of each template, selects one template that can be executed in the program environment provided by the program definition file designated through the designating means, and performs program effect processing per video file. The effect method defined in the effect template can therefore easily be kept consistent with the environment of the program being created. In other words, programs can always be produced with effects that do not feel out of place.

According to the invention of claim 19, usage count limit information, which limits the number of times the defined effect method is used when programming one digest video, can be set in an effect template. After selecting one executable effect template, the effect processing means checks whether usage count limit information is set in it. If so, it compares the number of times the selected template has already been used with the usage count limit information to determine whether the template can still be used, and selects another executable effect template if it cannot. The same effect template, in other words the same effect method, can therefore be prevented from being repeated more than a certain number of times, which keeps the effects from becoming stale and keeps viewers from tiring of them. Furthermore, setting the usage count limit information to a single use allows an effect to be used to good advantage exactly once in the program for one digest video.
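The usage count check described above can be illustrated with a minimal Python sketch. The names `EffectTemplate`, `max_uses`, and `pick_template` are hypothetical and do not appear in the patent; the sketch also assumes the candidate list already contains only templates executable in the current program environment.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EffectTemplate:
    name: str
    # Hypothetical stand-in for the usage count limit information;
    # None means no limit is set.
    max_uses: Optional[int] = None


def pick_template(candidates, used_counts):
    """Return the first candidate whose usage limit is not yet exhausted."""
    for template in candidates:
        used = used_counts.get(template.name, 0)
        if template.max_uses is None or used < template.max_uses:
            used_counts[template.name] = used + 1
            return template
    return None  # no usable template remains


candidates = [EffectTemplate("big_cheer", max_uses=1), EffectTemplate("small_clap")]
used = {}
first = pick_template(candidates, used)
second = pick_template(candidates, used)
print(first.name, second.name)
```

With a limit of one, `big_cheer` is chosen for the first video file only, and later files fall back to the next executable template.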

According to the invention of claim 20, the per-video-file program effect processing in the effect processing means has a sequential processing function and a batch processing function. The sequential processing function selects and processes the effect template to be used each time the selecting means finishes selecting effect templates for one video file. The batch processing function waits until the selecting means has finished selecting effect templates for all video files, then selects the effect template to be used for each video file and processes it. When processing with the batch processing function, the effect processing means refers to all effect templates selected by the selecting means and, for each set of effect templates having the same emotion type information and degree of emotional expression, counts the number of times that set was selected. Where a set selected more than once contains a plurality of different effect templates, it selects effect templates so that each template in the set is selected a uniform number of times. When programming a digest video, the sequential processing function can therefore be used when time is short, and the batch processing function can be used when there is enough time, giving natural effects in which repetition is less noticeable.
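One simple way to even out template usage in batch mode is to rotate through the templates of each set. The following Python sketch assumes that approach; the patent text does not specify how uniformity is achieved, and `assign_batch` and the sample template names are hypothetical.

```python
from itertools import cycle


def assign_batch(file_keys, templates_by_key):
    """Assign one template per video file, rotating within each set.

    file_keys: the (emotion type, degree) key chosen for each video file, in order.
    templates_by_key: maps each key to the templates stored under it.
    """
    # One independent round-robin iterator per set of templates.
    rotations = {key: cycle(names) for key, names in templates_by_key.items()}
    return [next(rotations[key]) for key in file_keys]


keys = [("joy", 2), ("joy", 2), ("sorrow", 1), ("joy", 2), ("joy", 2)]
templates = {("joy", 2): ["jump", "fist_pump"], ("sorrow", 1): ["sigh"]}
plan = assign_batch(keys, templates)
print(plan)
```

Because the whole list of keys is known before assignment starts, each template in a repeatedly selected set ends up used the same number of times, or within one use of each other when the count does not divide evenly.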

According to the invention of claim 21, designation information can be set in an effect template, specifying that the template is to be used for the program effect processing of the video file having the highest degree value, or of the video file having the lowest degree value, among the degree values associated with the emotion type information and degree of emotional expression of that template. When an effect template with the designation information set exists, the effect processing means compares the degree values of all video files for which that template was selected and uses the template only for the program effect processing of the video file having the maximum or minimum degree value. Effect templates, that is, effect methods, can therefore be selected to good effect.

According to the invention of claim 22, a program for causing a computer to execute the digest video programming method according to claim 11 or 12 is recorded on the medium, and reading and executing this program on a computer provides the same effects as the digest video programming method according to claim 11 or 12.

Embodiments of the video content description generating method, the video content description generating apparatus, the digest video programming method, the digest video programming apparatus, and the computer-readable recording medium on which a program for causing a computer to execute those methods is recorded, all according to the present invention, are described in detail below with reference to the accompanying drawings.

[Embodiment 1]
FIG. 1 is a schematic configuration diagram of the video content description generating apparatus of Embodiment 1. The video content description generating apparatus 100 of Embodiment 1 comprises three units. A description generating unit 101 receives, from a digest creation engine (not shown), a plurality of pieces of character information consisting of fragmentary character strings describing the content of each video scene retrieved as a scene for the digest video, and generates from that character information a description of the video content of each scene. A video content determination unit 102 determines the content of each video scene from the input character information. A connection expression selection unit 103 selects, based on the determination result of the video content determination unit 102, a connection expression from among resultative ("so"), adversative ("but"), parallel ("and"), additive ("furthermore"), and selective ("or") according to the relationship between the preceding and following video scenes.

Here, a hierarchically structured video stream is assumed. Such structuring is easily achieved by, for example, taking the entire video as the top level, dividing it into logically meaningful video scenes (units of video) to form the next level, dividing those scenes further to form the level below, and so on. It is also assumed that each video scene of the structured video stream has attached to it, as an index, a plurality of pieces of character information consisting of fragmentary character strings (or information convertible into character strings) that describe its content.
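As an illustration of this structuring, the following Python sketch models a scene tree in which every scene carries its index strings and a link to its parent. The class name `Scene`, its fields, and the sample labels are assumptions made for the example, not a data format defined by the patent.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Scene:
    label: str
    index_text: list  # fragmentary character strings attached as the index
    parent: Optional["Scene"] = field(default=None, repr=False, compare=False)
    children: list = field(default_factory=list)

    def add_child(self, label, index_text):
        """Divide this scene further by attaching a lower-level scene."""
        child = Scene(label, index_text, parent=self)
        self.children.append(child)
        return child


# Top level: the entire video; lower levels: progressively smaller scenes.
game = Scene("game", ["baseball game"])
inning = game.add_child("bottom of the 5th", ["Giants batting"])
at_bat = inning.add_child("at-bat", ["Kiyohara", "home run"])
print(at_bat.parent.label)
```

Keeping the parent link on each scene is what later lets a description for a lower-level scene draw on the character information of the scene one level above.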

The technique by which the digest creation engine searches the structured video stream for digest video scenes and outputs each retrieved video scene together with the fragmentary character strings (character information) describing its content can be readily implemented using technology previously filed by the present applicants (for example, Japanese Patent Application No. Hei 11-058916, "Digest creation apparatus, digest creation method, and computer-readable recording medium on which a program for causing a computer to execute the method is recorded").

The description generating unit 101 joins the descriptions of the preceding and following video scenes using the connection expression selected by the connection expression selection unit 103, and outputs the result. In addition, when generating a description for a video scene at a given level, the description generating unit 101 uses the hierarchy to generate, along with the description of that scene's video content, a preamble sentence and a postscript sentence for the description from the character information of the video scene one level above.

The operation of the above configuration is described in the following order: (1) connection expression addition processing, then (2) preamble/postscript sentence generation processing.
(1) Connection Expression Addition Processing
The connection expression addition processing is carried out jointly by the video content determination unit 102 and the connection expression selection unit 103. Rather than generating a description from the character strings (character information) describing each video scene and simply presenting the descriptions one after another, this processing focuses on the relationship between the contents of the preceding and following video scenes and adds an appropriate connection expression between the two descriptions. This smooths the flow of the text made up of the descriptions of the individual video scenes and helps the viewer understand the situation.

First, the function that takes as input the character information of two video scenes cut out as digest video by the digest creation engine, analyzes the content of those scenes, and determines the relationship between them is described. This function is hereinafter called the connection relation determination function.

In general there are the following five types of connection relation, and the connection relation determination function returns one of them.
1. Parallel: expresses listing items side by side.
Examples: also, and, or, as well as.
2. Additive: expresses adding something on.
Examples: moreover, besides, furthermore, on top of that, in addition.
3. Selective: expresses choosing one or the other.
Examples: or, or else, alternatively, otherwise.
4. Resultative (sequential connection): expresses that what is stated first is the cause or reason for what is stated next.
Examples: therefore, thus, then, hence, so, in that case, that is why.
5. Adversative (reverse connection): expresses that what is stated first and what is stated next stand in opposition.
Examples: but, however, yet, still, even so, nevertheless, though, nonetheless.

The character information input from the digest creation engine is given as the arguments of the connection relation determination function. In Embodiment 1, in addition to the fragmentary character strings describing the content of a video scene, the digest creation engine calculates the values of the importance determination parameters described later and outputs them to the video content description generating apparatus 100 as character information. The apparatus 100 then uses the importance determination parameter values together with the character strings as arguments of the connection relation determination function.

The connection relation determination function is described concretely below, taking a baseball program as an example. Typical connection expressions for a baseball program are the additive and adversative expressions shown below.
＊Additive connection between video scenes in which runs continue to be scored:
Example: furthermore → "<Furthermore>, with one out and runners on second and third, Kiyohara's home run ..."
＊Adversative expression when a scoring chance is missed:
Example: however → "Runner Takahashi advanced to third base. <However>, No. 4 batter Kiyohara flied out to center ..."
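The two baseball examples above amount to placing a connective word between consecutive scene descriptions. A minimal Python sketch of that joining step follows; the mapping `CONNECTIVES` and the function `join_descriptions` are hypothetical names, and the choice of one fixed word per relation is an assumption made for brevity.

```python
# One illustrative connective per relation; the text lists several alternatives.
CONNECTIVES = {
    "additive": "Furthermore",
    "adversative": "However",
}


def join_descriptions(previous, following, relation):
    """Join two scene descriptions with the connective for the given relation."""
    connective = CONNECTIVES.get(relation)
    if connective is None:
        # Unknown relation: present the descriptions without a connective.
        return f"{previous} {following}"
    return f"{previous} {connective}, {following}"


joined = join_descriptions(
    "Runner Takahashi advanced to third base.",
    "Kiyohara flied out to center.",
    "adversative",
)
print(joined)
```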

When the video for which descriptions are generated is baseball, the importance determination parameters used by the connection relation determination function are the following. All of them take positive values.
＊Attack level (importance determination parameter)
Indicates how important the event is offensively. The value rises for offensively important events such as hits and home runs.
＊Excitement level (importance determination parameter)
Indicates the viewer's expectation and excitement. The value rises, for example, when one of the cleanup hitters batting third, fourth, or fifth is at the plate, or when a runner on third base creates a particular scoring chance.
＊Pitcher level (importance determination parameter)
Indicates how well the pitcher and the defense are performing. The value rises for strikes and consecutive strikeouts.

FIG. 2 shows the algorithm of the connection relation determination function. To keep the explanation simple, this example algorithm distinguishes two cases. In the first, the structural class of the video for which a description is generated is small, such as the at-bat or pitch class (in other words, the video scene is at a lower level of the hierarchy). In the second, it is large, such as the inning class (in other words, the video scene is at a higher level of the hierarchy). In the first case, [attack level - pitcher level] is used as the index, and the calculated value is biased by the excitement level (the content index level). The magic numbers α, β, and γ are set to 5, 6, and 0, respectively. The relationship between innings is calculated from the change in score.

For baseball, the return value of the connection relation determination function is either additive or adversative. Other values cannot be ruled out in exceptional cases, but in almost all cases it is one of these two. The connection relation determination function does not depend on the viewer's preferences: whichever team the viewer supports, a reversal of the game situation is adversative and additional runs are additive.
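The text gives only an outline of the FIG. 2 algorithm, so the following Python sketch is an illustration under stated assumptions rather than a reconstruction of the figure. It assumes that the excitement "bias" is a multiplication, that scenes whose content index has the same sign are additive and scenes with opposite signs are adversative, and that innings are compared by the sign of the change in lead. The magic numbers α, β, and γ are left out because the text does not say how they enter the calculation. All names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Relation(Enum):
    ADDITIVE = "additive"
    ADVERSATIVE = "adversative"


@dataclass
class AtBat:
    attack: float      # attack level
    excitement: float  # excitement level
    pitcher: float     # pitcher level


def content_index(scene: AtBat) -> float:
    # Assumption: biasing by the excitement level is modeled as multiplication.
    return (scene.attack - scene.pitcher) * scene.excitement


def relate_at_bats(prev: AtBat, nxt: AtBat) -> Relation:
    # Assumption: the flow continues if both indexes point the same way.
    same_direction = (content_index(prev) >= 0) == (content_index(nxt) >= 0)
    return Relation.ADDITIVE if same_direction else Relation.ADVERSATIVE


def relate_innings(prev_lead_change: int, next_lead_change: int) -> Relation:
    # Assumption: opposite signs of the lead change mean the momentum reversed.
    if prev_lead_change * next_lead_change < 0:
        return Relation.ADVERSATIVE
    return Relation.ADDITIVE


advance = AtBat(attack=3, excitement=2, pitcher=1)   # content index 4
fly_out = AtBat(attack=1, excitement=2, pitcher=3)   # content index -4
home_run = AtBat(attack=5, excitement=3, pitcher=1)  # content index 12

print(relate_at_bats(advance, fly_out).value)
print(relate_at_bats(advance, home_run).value)
```

Neither function looks at which team the viewer supports, which matches the statement that the relation is independent of viewer preference.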

(2) Preamble and postscript sentence generation
When generating the description of a video scene, the description generation unit 101 presents, as needed, the various circumstances and preconditions at that point as a preamble sentence. Likewise, after describing one video scene and before moving on to the description of the next, it presents, as needed, information such as the effect that scene had on the whole as a postscript sentence. These preamble and postscript sentences are generated from the character information of the parent scene (the video scene in the next higher layer) of the video scene concerned, using the hierarchical structure of the video scenes.

Specifically, in the case of baseball video for example, the preamble sentence (preamble expression) is generated from information indicating the situation of the video scene being processed at that point.
For example, if the character information attached to the parent scene includes
・Score situation
・Attacking team name
・Out count
・Runners on base
・Pitcher name
・Batter name
・Ball count
then a character string such as "Bottom of the 5th, Giants at bat, one out, runners on second and third,," can be generated automatically as the preamble sentence.
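As a rough illustration of how a preamble string might be assembled from the parent scene's character information, the sketch below joins whichever fields are present in a fixed order. The field names (`inning`, `attacking_team`, `outs`, `runners`) and the function `build_preamble` are hypothetical; the patent does not specify a data format.

```python
# Order in which fields appear in the preamble (an assumption for this sketch).
PREAMBLE_FIELDS = ["inning", "attacking_team", "outs", "runners"]


def build_preamble(parent_info):
    """Join the available fields of the parent scene into one preamble string.

    Fields that are missing or empty are skipped, so the preamble only
    mentions what the character information actually provides.
    """
    parts = [parent_info[name] for name in PREAMBLE_FIELDS if parent_info.get(name)]
    return ", ".join(parts)


parent_scene = {
    "inning": "Bottom of the 5th",
    "attacking_team": "Giants at bat",
    "outs": "one out",
    "runners": "runners on second and third",
}
print(build_preamble(parent_scene))
```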

The postscript sentence (postscript expression), in turn, conveys result-related information generated from the information indicating the situation at that point, for example:
・Result of the game
・Result for the runners on base
・Scoring result

As described above, according to the video content description generation method and the video content description generating apparatus of Embodiment 1, the content of each video scene is determined from the character information; a connection expression is selected from among resultative, adversative, parallel, additive, selective, and so on according to the relation between the preceding and following video scenes; and the selected connection expression is used to join the descriptions of those scenes into a description of the video content. This makes the connection and relation between the descriptions generated from the individual video scenes clear, so that a description with a smooth flow that reads naturally to the viewer (user) can be generated.

Furthermore, when a description is generated for a video scene in a given layer among the video scenes obtained as search results from a hierarchically structured video stream, the hierarchical structure is used to generate a preamble sentence for the description from the character information of the video scene in the higher layer. The preamble sentence can therefore be generated as an overview that makes clear what each retrieved video scene means in relation to the preceding and following video scenes. Similarly, a postscript sentence for the description is generated from the character information of the video scene in the layer above the scene concerned, so the postscript sentence can also serve as an overview that makes clear what each retrieved video scene means in relation to the preceding and following scenes.

[Embodiment 2]
FIG. 3 is a schematic configuration diagram of the video content description generating apparatus of Embodiment 2. The video content description generating apparatus 200 of Embodiment 2 comprises: a description generation unit 201 that receives, from a digest creation engine (not shown), a plurality of pieces of character information consisting of fragmentary character strings describing the content of each video scene retrieved as a scene for a digest video, and uses the character information to generate a description explaining the video content of the scene; a storage unit 202 that stores, defined in advance for each video scene as emotion degree parameters, a plurality of parameters for calculating the degree of the user's emotional change (preference level) in response to the video content; a setting unit 203 for setting the user's preference information; and a calculation unit 204 that uses the emotion degree parameters corresponding to each video scene and the preference information to calculate a degree value (preference level value) of the user's emotional reaction to each video scene.

For each of the emotion degree parameters, the degree (preference level) is quantified by the combination of the content of the character information attached to the video scene and the content of the preference information. The calculation unit 204 calculates the degree value (preference level value) using these quantified degrees.

In addition, as described in detail later, in Embodiment 2 the description generation unit 201 adds an emotion expression sentence, based on the degree value (preference level value) calculated by the calculation unit 204, when it generates the description explaining the video content of a video scene from the character information.

With the above configuration, the operation of the emotion expression generation process (the process of adding emotion expressions), which is the main part of Embodiment 2, is described concretely below.
In the emotion expression generation process, when a description is generated from the character information of each video scene, the facts are not merely stated objectively; the manner of expression is varied using the viewer's preference information. For example, if the search result is pleasing to the viewer, an expression full of joy is generated, and if the search result is saddening, an expression conveying sadness is generated. In Embodiment 2 the emotion is expressed in the description (text), but the emotion expression generation process itself can also be reflected in staging effects such as the music in the video and the color tone of the screen, or in the facial expression of a virtual character that speaks the description.

The overall flow of the emotion expression generation process performed by the storage unit 202, the setting unit 203, the calculation unit 204, and the description generation unit 201 is explained here in terms of the algorithm of a function that calculates the viewer's preference level (degree) for a search result (hereinafter called the emotion degree determination function).

Baseball is used as the example below. FIG. 4 shows the algorithm of the emotion degree determination function. The preference level is first calculated on the premise that the user is a fan of the attacking team. If the preference set in the preference information is for the defending team, the sign is inverted at the end. That is, when the attack is going well, the degree of joy (the degree of the user's emotional change: a positive change) is high for fans of the attacking team, whereas the degree of sadness (a negative change) is high for fans of the defending team.

The value is also amplified as the user's degree of preference increases. The adjustment value φ for this amplification is set to "5" in the figure. This makes it possible to express changes in the user's emotions: for example, when a favorite player is in the game, good scenes feel even happier and bad scenes even sadder.

Note that this algorithm was written on the assumption that, in the preferences set in the preference information, the team of the user's favorite player is the same as the team the user supports.
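The two properties described for the emotion degree determination function (sign inversion for fans of the defending team, and amplification that grows with the user's degree of preference) can be sketched as below. The exact formula of FIG. 4 is not given in the text, so the multiplicative form `1 + PHI * favorite_degree`, the range of `favorite_degree` (0 to 1), and all names are assumptions made only for illustration.

```python
PHI = 5  # amplification adjustment value; the figure sets it to 5


def preference_level(scene_value, favorite_degree, supports_attacking_team):
    """Return a signed preference level for one video scene.

    scene_value: how good the scene is for the attacking team (assumed input).
    favorite_degree: 0.0 (no favorite player involved) to 1.0 (strong favorite).
    supports_attacking_team: False if the user supports the defending team.
    """
    # Step 1: compute as if the user were a fan of the attacking team,
    # amplifying the value as the degree of preference rises (assumed form).
    level = scene_value * (1 + PHI * favorite_degree)
    # Step 2: invert the sign at the end for fans of the defending team.
    return level if supports_attacking_team else -level
```

Under these assumptions a good scene for the attacking team yields a positive (joy) value for its fans and the mirrored negative (sadness) value for fans of the defending team.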

When the character information of each video scene is input to the video content description generating apparatus 200 of Embodiment 2, the calculation unit 204 reads the emotion degree parameters corresponding to each video scene from the storage unit 202, refers to the preference information set in the setting unit 203, sets the relevant preference information and character information in the emotion degree parameters, and performs the calculation to obtain the preference level value of the target video scene. Next, the description generation unit 201 receives the character information of each video scene, generates a description explaining the video content of the scene, and adds an emotion expression sentence based on the preference level (degree value) obtained by the calculation unit 204. For example, when the preference level value of the video scene satisfies (preference level > θ), an emotion expression sentence conveying joy is added. If the description is "Two out, runner on third, Takahashi's timely hit turns the game around.", the emotion expression sentence "We did it!" is added to produce the description "Two out, runner on third, Takahashi's timely hit turns the game around. We did it!"
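The threshold step in this paragraph can be sketched in a few lines. Only the (preference level > θ) case comes from the text; the value of `THETA`, the symmetric negative branch, and the English wording of the emotion sentences are assumptions for illustration.

```python
THETA = 1.0  # hypothetical threshold; the text does not give a value for θ


def add_emotion_sentence(description, level, theta=THETA):
    """Append an emotion expression sentence according to the preference level."""
    if level > theta:
        # Case stated in the text: add a sentence conveying joy.
        return description + " We did it!"
    if level < -theta:
        # Assumed mirror case for a strongly negative level.
        return description + " What a shame."
    # Within the threshold band the description is left objective.
    return description


base = "Two out, runner on third, Takahashi's timely hit turns the game around."
print(add_emotion_sentence(base, 3.0))
```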

As described above, according to the video content description generation method and the video content description generating apparatus of Embodiment 2, the description generation means adds an emotion expression sentence based on the degree value when it generates the description explaining the video content of a video scene from the character information. A description that reflects the viewer's (user's) preferences and matches the viewer's feelings can therefore be generated. In other words, by generating a description that reflects preferences more flexibly (or in graded steps) according to the degree value of the user's emotional reaction, a personalized description in line with the user's preferences can be created.

Also, in Embodiment 2, the degree of each of the emotion degree parameters is quantified by the combination of the content of the character information attached to the video scene and the content of the preference information, and the calculation unit 204 calculates the degree value from these quantified degrees. A description that reflects preferences can thus be matched even more closely to the viewer's (user's) feelings. In other words, emotion expression sentences can be added more flexibly (or in graded steps) according to the degree value of the user's emotional reaction, and a personalized description reflecting the user's preferences can be created.

[Embodiment 3]
Embodiment 3 describes a method of generating a description of video content using a description generation algorithm based on the hierarchical structure of the video. FIG. 5 shows the description generation function (description generation algorithm) of Embodiment 3. As shown in the figure, the description generation function calls itself recursively down the hierarchy and generates the descriptions in order, using the connection relation determination function and the emotion degree determination function described in Embodiment 1 and Embodiment 2.

For example, when a description is generated for a video scene, the function first checks whether that scene is the first one in its class (layer). If it is, there is no preceding video scene, so the connection relation determination function is not called. For each class layer, a preamble sentence and a postscript sentence are added to the set of class instances at the same level. In baseball, for example, a preamble sentence such as "Bottom of the 5th, Giants at bat, one out, runners on second and third" is generated from the character information. As the postscript sentence, the score at the end of the inning, a summary of the inning, and the like are generated.
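A minimal sketch of this recursive traversal is given below. It is not the function of FIG. 5: the `Scene` structure and its field names are hypothetical, and the `connective` field stands in for whatever the connection relation determination function would return for a scene relative to its preceding sibling.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Scene:
    text: str = ""        # description of this scene itself (may be empty)
    preamble: str = ""    # preamble sentence for this scene's layer
    postscript: str = ""  # postscript sentence for this scene's layer
    connective: str = ""  # stand-in for the connection function's result
    children: List["Scene"] = field(default_factory=list)


def generate(scene, is_first=True):
    """Return the sentences for a scene and its sub-scenes in output order."""
    sentences = []
    # The first scene of a layer has no predecessor, so no connective is used.
    if not is_first and scene.connective:
        sentences.append(scene.connective)
    if scene.preamble:
        sentences.append(scene.preamble)
    if scene.text:
        sentences.append(scene.text)
    # Recurse into the lower layer; only the first child is marked as first.
    for index, child in enumerate(scene.children):
        sentences.extend(generate(child, is_first=(index == 0)))
    if scene.postscript:
        sentences.append(scene.postscript)
    return sentences


inning = Scene(
    preamble="Bottom of the 1st,",
    postscript="End of the bottom of the 1st.",
    children=[
        Scene(text="Kawai singles to center.", connective="However,"),
        Scene(text="Matsui walks.", connective="Furthermore,"),
    ],
)
```

Empty fields are simply skipped, which mirrors the statement elsewhere in the text that no sentence is generated where none is needed.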

The calculated emotion level value is used at various points in the description generation function. When a preamble sentence is generated, an expression such as "Happily,,," is added if the value is positive (happy); conversely, an expression such as "Unfortunately,,," is added if the value is negative (sad). When a postscript sentence is generated, expressions such as "That was really great" or "What a thoroughly disappointing result" are added.

FIG. 6 is an explanatory diagram showing the order in which descriptions are generated for a particular game when the description generation function of Embodiment 3 is used. If the viewer is a Hiroshima fan, the descriptions are as shown in (1) to (17) below. Where no corresponding sentence is needed, no description is generated. In the figure, arrows and numbers indicate the order in which the descriptions are generated.

(1) On October 3, the Hiroshima vs. Giants game was played at Tokyo Dome.
(2) Top of the 1st, Hiroshima at bat,
(4) In Eto's at-bat, he hit a solo home run.
(5) That was great.
(6) At the end of the top of the 1st, Hiroshima leads 1-0 on Eto's home run.
(7) But, frustratingly, in the bottom of the 1st the Giants immediately turned the game around.
(8) First,
(9) Leadoff batter Kawai reached base with a single to center.
(10) The Giants' counterattack begins.
(11) Furthermore,
(12) Matsui reached base on a walk.
(13) Runners on first and second. Hiroshima is in a pinch.
(14) On top of that, unfortunately,
(15) Takahashi's timely hit scored two runs. The Giants take the lead, 1-2.
(17) At the end of the bottom of the 1st, Hiroshima trails 1-2 after the Giants' comeback. A real shame.

As described above, according to Embodiment 3, the description of the video content is generated using a description generation algorithm based on the hierarchical structure of the video. In addition to the effects of Embodiment 1 and Embodiment 2, the description can therefore be composed more clearly using hierarchical structural expressions, and the text becomes easier to read. In particular, Embodiment 3 has the effect that the hierarchical structure of the video can be used in a general-purpose manner.

The following describes the case where the video content description generation method and the video content description generating apparatus of Embodiments 1 to 3 are applied to a digest creation system. FIG. 7 is a schematic diagram of a digest creation system that incorporates the video content description generation method of the present invention as a video text generation function. It outlines the whole system, from the output of the scenes (video scenes) cut out by the digest creation engine and their brief descriptions to how they are finally displayed on the TV set. In the figure, the user interface (UIF) through which operations proceed interactively with the TV viewer is called the program viewing user interface, hereinafter abbreviated as PV (Program Viewer).

The descriptions (character strings) and video scenes produced by the digest creation engine are input to the description generation function, which generates descriptions that include connection expressions, emotion expressions, and structural expressions. The generated descriptions, the video scenes, the calculated connection types, and the preference levels are passed to the PV.

Because the PV is a user interface intended for TV viewing, it needs the ability to describe actions of the kind possible in a TV program scenario. TVML is known as a language that meets this requirement. The TVML technology is described in Hayashi, Orihara, Shimoda, et al., "Language Specification of the TV Program Description Language TVML and Its CG Description Method," 3rd Symposium on Intelligent Information Media, pp. 75-80, 1997.

TVML is a widely used language whose specification has been well studied as a language for describing TV program scenarios, so the PV interpreter can use a TVML interpreter for the TV program description language. By calling the TVML interpreter from the PV interpreter, the following TVML functions become available.
* Selection and placement of CG characters and their actions in the scenario (tilting the head, etc.)
* Camera position setting, switching among multiple cameras, pan/tilt
* Playback of video and audio files
* Video effects
* Subtitle display

The video output by the digest creation system is ultimately played back by the video playback function of TVML. It is also desirable for the PV description language to be able to describe TV-style staging effects such as lighting changes at scene transitions and camera zoom-in actions.

In addition, as shown in the figure, character designs, the characters' vocabularies, and so on are stored in a database as a TVML library. For example, the PV can search the vocabulary database of the currently selected character, find the line that character uses when speaking a connective of the given type, and embed it in the code. Concretely, with multilingual support, the adversative expression is rendered as "しかし", "but", "however", and so on, depending on the character.
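The vocabulary lookup described here amounts to a two-level mapping from character and connection type to a spoken line. The sketch below assumes an in-memory dictionary; the character names, the `additive` entries, and the function `connective_line` are hypothetical, and only the three adversative renderings come from the text.

```python
# Hypothetical per-character vocabulary database.
VOCABULARY = {
    "japanese_announcer": {"adversative": "しかし", "additive": "さらに"},
    "english_announcer": {"adversative": "but", "additive": "furthermore"},
    "english_commentator": {"adversative": "however", "additive": "moreover"},
}


def connective_line(character, connection_type, vocabulary=VOCABULARY):
    """Look up how the selected character phrases a given connection type."""
    return vocabulary[character][connection_type]


print(connective_line("english_commentator", "adversative"))
```

Keeping the connection type abstract until this lookup lets the same generated description be voiced by different characters or in different languages.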

In a digest creation system of this kind, besides supporting simple video search queries, a major issue is how to present the digest video obtained as a search result in an easily understandable way. Incorporating the video content description generation method and the video content description generating apparatus of the present invention as a description generation function clearly goes a long way toward solving this problem.

The video content description generation method of Embodiments 1 to 3 described above can be realized by having a computer execute a program prepared in advance according to the foregoing description and the procedures shown in the flowcharts (algorithms). The program is provided recorded on a computer-readable recording medium such as a hard disk, a floppy (R) disk, a CD-ROM, an MO, or a DVD. Alternatively, it can be distributed over a network.

[Embodiment 4]
Embodiment 4 presents the digest video programming method and the digest video programming apparatus of the present invention. The digest video programming apparatus of Embodiment 4 receives the video scenes retrieved from a single video stream as scenes for a digest video, together with the descriptions of the video content created for those scenes, and creates a digest video program by playing back each video scene and, in addition, providing the description of the video content in speech or text through a preset virtual character. Along with the video scenes and the descriptions, it receives a degree value of the virtual character's emotional reaction to the video content of each scene, and it stages the virtual character's emotional expression for each scene based on that degree value.

The apparatus also receives, together with the video scenes for the digest video, the description, preamble sentence, postscript sentence, and degree value generated for each video scene by the video content description generating apparatus 200 of Embodiment 2, and creates the digest video program from them. In doing so, in addition to playing back each video scene, it provides the description, preamble sentence, and postscript sentence in speech through the preset virtual character and stages the virtual character's emotional expression for each scene based on the degree value.

FIG. 8 is a block diagram of the digest video programming apparatus 400 of Embodiment 4. Reference numeral 200 denotes the video content description generating apparatus of Embodiment 2 described above. As a precondition, it is assumed that, for each video scene retrieved as a scene for the digest video, the video content description generating apparatus 200 generates a description, a preamble sentence, a postscript sentence, a degree value indicating the degree of the user's emotional change in response to the video content, and also a super (caption), and that these six items of information are passed to the digest video programming apparatus 400.

In Embodiment 4, these six items of information are referred to by the following names (the earlier term is given in parentheses).
1) Video scene (video scene)
2) Preamble description (preamble sentence)
3) Event description (description)
4) Postscript description (postscript sentence)
5) Super (super)
6) Emotion level parameter (degree value carrying emotion type information)

Of these six items, however, those other than the video scene are generated only as needed and may be left unset where they are not needed. The emotion level parameter carries emotion type information indicating the kind of emotion, such as joy, anger, sorrow, or pleasure. The emotion type information is set to a specific kind of emotion, for example "happy", "fun", "funny", "surprised", "sad", "frustrated", "disappointed", or "relieved".

The emotion level parameter may also consist of several emotion level parameters. For example, the emotion level parameter of one video scene may be made up of two emotion level parameters, one carrying the emotion type information "frustrated" and one carrying the emotion type information "disappointed". Using several emotion level parameters in this way makes it possible to express, and use as information, a combined emotion such as "frustrated and disappointed".

The digest video programming apparatus 400 of Embodiment 4 comprises a video file generation unit 401, a program definition file database 402, an effect definition database 403, an effect template selection unit 404, an effect processing unit 405, a PVML interpreter 406, a TVML player 407, and a TV (television: display device) 408. Although not shown, it also has designation means for designating a desired program definition file from among the plurality of program definition files described later. This designation means can easily be implemented with the display screen and keyboard of a personal computer or the like.

The video file generation unit 401 receives, for each video scene, the event description, preamble description, postscript description, super (caption), and emotion level parameter from the video content description generating apparatus 200, and generates a video file, which is the unit of processing for programming, by associating the event description, preamble description, postscript description, and emotion level parameter with each video scene.
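One way to picture the video file produced here is as a record that always has a video scene and optionally carries the other associated items, with the emotion level parameter allowed to hold several typed values. The sketch below is an assumed layout for illustration only; the class and field names are hypothetical and the patent does not define a file format.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class EmotionLevel:
    kind: str     # emotion type information, e.g. "frustrated"
    value: float  # degree value


@dataclass
class VideoFile:
    scene: str                        # reference to the video scene (required)
    preamble: Optional[str] = None    # preamble description, if generated
    event: Optional[str] = None       # event description, if generated
    postscript: Optional[str] = None  # postscript description, if generated
    emotions: List[EmotionLevel] = field(default_factory=list)


def emotion_kinds(video_file):
    """List the emotion types attached to a video file, in stored order."""
    return [emotion.kind for emotion in video_file.emotions]


example = VideoFile(
    scene="scene-015",
    event="Takahashi's timely hit scored two runs.",
    emotions=[EmotionLevel("frustrated", 0.8), EmotionLevel("disappointed", 0.6)],
)
```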

The program definition file database 402 stores the various items of program configuration information as program definition files. The configuration information in a program definition file includes, for example, at least one virtual character, the program's studio set, the number and positions of cameras, CG lighting, CG props, sound, the program title, and the super settings. A plurality of program definition files are stored in advance, and files can easily be added or changed by setting the configuration information in a predetermined format.

The effect definition database 403 stores a plurality of effect templates. In each effect template, one staging method is defined for each degree of emotional expression, the degrees being set in at least a plurality of steps (for example, the three steps "very", "normally", and "slightly"). The effect templates are classified and stored with the emotion type information and the degree of emotional expression as key indexes.

The effect definition database 403 also stores effect templates that are classified with several pieces of emotion type information, and the degree of emotional expression of each, as key indices.

Furthermore, each effect template carries program environment information indicating the program environment in which its effect method can be applied and, where necessary, use-count limit information that restricts how many times the effect method may be used in programming a single digest video.

In addition, an effect template can, where necessary, carry designation information specifying that it is to be used for the program effect processing of the video file with the highest, or the lowest, emotion level parameter among the emotion level parameters associated with the template's emotion type information and degree of emotional expression.

The effect template selection unit 404 receives the video files, determines the degree of emotional expression of each video file from its emotion level parameter, and selects from the effect definition database 403 the effect templates for that degree of emotional expression. Specifically, when selecting effect templates, it determines from the emotion level parameter the emotion type information and degree of emotional expression to be used as key indices, and selects all matching effect templates from the effect definition database 403.

When the emotion level parameter of a video file consists of a plurality of emotion level parameters, the effect template selection unit 404 determines from those parameters the pieces of emotion type information, and the degree of emotional expression of each, to be used as key indices, and selects all matching effect templates from the effect definition database 403.

The effect processing unit 405 receives the program definition file, the video files, and the effect templates. For each video file it performs program effect processing based on the selected effect template by setting at least the playback timing of the video scene, the event description, preamble description, and postscript description to be output as the virtual character's speech together with the timing of that speech, and the actions of the virtual character. The program definition file used here is the one specified via the designation means.

With the above configuration, an outline of the processing of the digest video programming device 400 is described with reference to FIG. 9. The device first receives the input files generated by the video content description generation device 200 (preamble description, event description, postscript description, caption, and emotion level parameter) and the video (the video scenes for the digest video). A digest video naturally comprises a plurality of video scenes, and one set of the above input files is generated and output for each scene. Some video scenes have no preamble description or postscript description.

As described in Embodiments 1 to 3, the three descriptions already contain emotional expressions, connection expressions, and hierarchical structure expressions. The emotion level parameter (a degree value carrying emotion type information) from which the video content description generation device 200 produced the emotional expressions is passed unchanged to the digest video programming device 400 so that it can be used to decide the effects for each video scene.

As shown at S901 in FIG. 9, the video file generation unit 401 generates a video file that associates the input video scene with its preamble description, event description, postscript description, caption, and emotion level parameter.

As shown at S902 to S904 in FIG. 9, the effect template selection unit 404 determines an emotion ID (the degree of emotional expression) from the emotion level parameter of each video file, that is, of each video scene. Specifically, an emotion expression definition file defines in advance, for each emotion ID, the range of emotion level parameter values (level values) to which that ID applies. For each scene, the unit finds the emotion ID to which the emotion level parameter belongs (S902, S903) and, using that emotion ID as the key index (search key), selects all matching effect templates from the effect definition database 403.

When the emotion level parameter consists of a plurality of emotion level parameters, only emotion IDs defined over a plurality of emotion level parameters are considered. Using the parameters as key indices, the unit determines the emotion ID for which every parameter matches, and selects all corresponding effect templates from the effect definition database 403. For example, with two emotion level parameters p1 and p2, the emotion ID is "frustrated and disappointed" when the values fall in the range (p1: -5 to -3) and (p2: 5 to 6).

The emotion expression definition file lists a number of definitions, each pairing an emotion ID with its value range pattern. The definitions are examined in order from the top, and the first emotion ID whose pattern matches is selected.
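The first-match lookup over the emotion expression definition file can be sketched as follows. This is only an illustration under stated assumptions: the in-memory layout of the definitions, the function name `find_emotion_id`, and every range other than the (p1, p2) example are hypothetical, and the rule that a definition is a candidate only when it covers exactly the supplied parameters is one possible reading of the text.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical in-memory form of the emotion expression definition file:
# an ordered list of (emotion ID, {parameter name: (low, high)}) entries.
Ranges = Dict[str, Tuple[float, float]]
EMOTION_DEFINITIONS: List[Tuple[str, Ranges]] = [
    ("frustrated and disappointed", {"p1": (-5, -3), "p2": (5, 6)}),  # example from the text
    ("very happy", {"p1": (4, 6)}),      # illustrative range, not from the source
    ("slightly happy", {"p1": (1, 3)}),  # illustrative range, not from the source
]


def find_emotion_id(
    params: Dict[str, float],
    definitions: List[Tuple[str, Ranges]] = EMOTION_DEFINITIONS,
) -> Optional[str]:
    """Return the first emotion ID whose range pattern matches every parameter."""
    for emotion_id, ranges in definitions:
        # Assumption: a definition is a candidate only if it is defined over
        # exactly the parameters supplied for the scene.
        if set(ranges) != set(params):
            continue
        if all(low <= params[name] <= high for name, (low, high) in ranges.items()):
            return emotion_id  # definitions are scanned top-down; first match wins
    return None  # no definition matched
```

Because the scan stops at the first match, the order of the definitions acts as a priority order when value ranges overlap.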

Next, the effect template selection unit 404 chooses an effect template corresponding to the selected emotion ID. Basically, the relationship between an emotion ID and the effect templates prepared in advance is one-to-many. Several effect templates are prepared for one emotion ID so that the effects are varied and the program does not become dull. For example, if a set of four effect templates defining the following effect methods is prepared for the emotion ID "very happy", one of them can be chosen as appropriate whenever a "very happy" scene occurs.
(Effect method 1) Turn bright red in the face and stand up
(Effect method 2) Shed tears of joy
(Effect method 3) Give three cheers of "banzai"
(Effect method 4) Break open a kusudama (decorative paper ball) and release doves

A point to note when defining effect templates is to set the framework of the effect environment (the program environment information) first. For example, it must be decided how many virtual characters appear and what props are used. The same settings must also be made as the program environment in the program definition file. Setting the program environment (effect environment) in both the effect template and the program definition file ensures that the same environment is used consistently throughout one program.

For example, once it has been decided that two virtual characters act as presenters and the corresponding program definition file has been chosen, only effects (effect templates) that fit the two-character environment are combined with it. An environment identifier (program environment information) is written in both the effect template and the program definition file and is used to confirm that the environments are the same.

In Embodiment 4, the effect templates and program definition files are written in PVML. When an effect template is created, the effect is defined using the following two kinds of variables.
(Variable 1) Information passed from the video content description generation device 200
Example: the event description is the variable &enetscript
(Variable 2) Items defined in the program definition file
Example: a virtual character is &Castnn (nn is an index)
A music or sound effect file is &Soundnn

Because an effect template is simply PVML code written with the defined variables, synchronization between content items can be described freely. For example, synchronization such as the following is possible.
(1) First, the virtual character speaks the preamble description.
(2) Next, the following are performed in parallel.
(2a) Playback of the video scene
(2b) Speaking of the event description
(2c) Display of the caption (superimposed text)
(3) After that, the postscript description is spoken.
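The synchronization pattern listed above (a sequence whose middle step runs three items in parallel) can be modelled as a small tree of sequential and parallel containers. The sketch below is written in Python rather than PVML and is an illustration only: the class names, the `schedule` function, and all durations are assumptions, not part of the described device.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class Item:
    name: str
    duration: float  # seconds; illustrative values only


@dataclass
class Seq:
    children: List["Node"]  # children run one after another


@dataclass
class Par:
    children: List["Node"]  # children start together


Node = Union[Item, Seq, Par]
Timeline = List[Tuple[str, float, float]]


def schedule(node: Node, start: float = 0.0, out: Timeline = None):
    """Return (end time, [(name, start, end), ...]) for a seq/par tree."""
    if out is None:
        out = []
    if isinstance(node, Item):
        end = start + node.duration
        out.append((node.name, start, end))
        return end, out
    if isinstance(node, Seq):
        t = start
        for child in node.children:
            t, _ = schedule(child, t, out)  # next child starts when this one ends
        return t, out
    # Par: the block ends when its longest child ends.
    end = start
    for child in node.children:
        child_end, _ = schedule(child, start, out)
        end = max(end, child_end)
    return end, out


scene = Seq([
    Item("preamble speech", 4.0),
    Par([Item("video scene", 10.0), Item("event speech", 7.0), Item("caption", 10.0)]),
    Item("postscript speech", 3.0),
])
total, timeline = schedule(scene)
```

With these assumed durations the postscript speech starts only after the longest of the three parallel items has finished.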

The effect processing unit 405 executes S905 shown in FIG. 9. It first receives the program definition file, the video files, and the effect templates. For each video file it then performs program effect processing based on the selected effect template by setting at least the playback timing of the video scene, the event description, preamble description, and postscript description to be output as the virtual character's speech together with the timing of that speech, and the actions of the virtual character. The program definition file used here is the one specified via the designation means.

Furthermore, when the effect template selection unit 404 has selected more than one effect template, the program environment information of each template is checked against the program environment of the program definition file specified via the designation means. One of the matching effect templates (that is, the executable effect templates) is then selected, and the program effect processing for the video file is performed with it.

After one executable effect template has been selected, if use-count limit information is set in that template, the effect template selection unit 404 compares the number of times the template has already been used with the use-count limit information to determine whether it may be used. If it may not, another executable effect template is selected.
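The two checks described here, matching the program environment and honouring the use-count limit, can be sketched as a single filter. This is a minimal sketch under assumptions: the `EffectTemplate` fields, the function name `pick_template`, and the choice of taking the first usable candidate in list order are all hypothetical, since the text only says that one executable template is selected.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class EffectTemplate:
    name: str
    environment_id: str             # program environment information
    max_uses: Optional[int] = None  # use-count limit information; None = unlimited


def pick_template(
    candidates: List[EffectTemplate],
    program_environment_id: str,
    use_counts: Dict[str, int],
) -> Optional[EffectTemplate]:
    """Return the first candidate that is executable and not used up."""
    for template in candidates:
        # Executable only if the template's environment matches the program's.
        if template.environment_id != program_environment_id:
            continue
        used = use_counts.get(template.name, 0)
        # Skip templates whose use-count limit has been reached.
        if template.max_uses is not None and used >= template.max_uses:
            continue
        use_counts[template.name] = used + 1  # record this use
        return template
    return None  # nothing usable in this environment
```

The `use_counts` dictionary is kept by the caller so that the count persists across all scenes of one digest video program.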

Specifically, after deciding the effect template for each video scene, the effect processing unit 405 refers to the program definition file and fills the variables described above in each scene's effect template (PVML code) with actual data. The processing flow of FIG. 9 shows batch processing, in which the final PVML code is generated all at once at the end. When the processing is to proceed interactively with the program user, sequential processing is performed instead, in which PVML code is generated and executed for each video scene.
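Filling the template variables with actual data amounts to a text substitution over names that begin with `&`. The sketch below shows one way this could be done; the function name `fill_template`, the regular expression, and the template string are assumptions, and the template text deliberately uses a made-up `say(...)` notation rather than real PVML syntax.

```python
import re
from typing import Dict

# A variable is "&" followed by letters and an optional numeric index,
# e.g. &enetscript, &Cast01, &Sound02 (pattern assumed for illustration).
VARIABLE = re.compile(r"&([A-Za-z]+[0-9]*)")


def fill_template(template: str, values: Dict[str, str]) -> str:
    """Replace every &name in the template with its value."""

    def replace(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name not in values:
            # Fail loudly rather than emit code with an unfilled variable.
            raise KeyError(f"no value for variable &{name}")
        return values[name]

    return VARIABLE.sub(replace, template)


# Hypothetical template text; not actual PVML.
filled = fill_template(
    'say(&Cast01, "&enetscript")',
    {"Cast01": "BOB", "enetscript": "Home run!"},
)
```

In batch processing this substitution would run once per scene after all templates are chosen, while in sequential processing it would run as each scene's template is decided.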

Further, as shown at S906 in FIG. 9, a separate definition group file can be created so that coherent groups of actions and effects are turned into subroutines and shared. A series of effects can then be selected by specifying the separate definition group file in which those effects are encapsulated.

As described above, the effect processing unit 405 offers two modes of program effect processing per video file: sequential processing, in which the effect template to be used is chosen and processed as soon as the selection of effect templates for one video file is finished, and batch processing, in which it waits until effect templates have been selected for all video files, chooses the template to be used for each video file, and then processes them.

As a variation of the batch processing, all the effect templates selected by the effect template selection unit 404 may be examined to find, for each set of effect templates having the same emotion type information and degree of emotional expression, the number of times that set was selected. For a set that was selected more than once and contains several different effect templates, the templates are then chosen so that each is used an equal number of times. In other words, the number of times each emotion ID was selected is counted, and for an emotion ID that was selected more than once and has several effect templates, the templates are chosen so that their selection counts are uniform.
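One simple way to make the selection counts uniform is to rotate through each emotion ID's templates in turn. The sketch below uses that round-robin rule, which is an assumption: the text requires only that the counts come out even, not any particular order. The function name and the template names are illustrative.

```python
from itertools import cycle
from typing import Dict, List


def assign_uniformly(
    scene_emotions: List[str],
    templates_by_emotion: Dict[str, List[str]],
) -> List[str]:
    """Choose one template per scene, rotating within each emotion ID."""
    # One endless rotation per emotion ID, so repeated scenes with the same
    # emotion ID step through that ID's templates in turn.
    rotations = {eid: cycle(names) for eid, names in templates_by_emotion.items()}
    return [next(rotations[eid]) for eid in scene_emotions]


chosen = assign_uniformly(
    ["very happy", "sad", "very happy", "very happy", "very happy"],
    {
        "very happy": ["blush and stand", "tears of joy", "banzai", "kusudama"],
        "sad": ["sigh"],
    },
)
```

With round-robin rotation the usage counts of the templates under one emotion ID never differ by more than one.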

As a further variation of the batch processing, when an effect template with designation information exists, the effect processing unit 405 may compare the emotion level parameters of all video files for which that template was selected and use the template only for the program effect processing of the video file with the largest, or the smallest, emotion level parameter.
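This variation reserves a designated template for the single scene with the most extreme emotion level. A minimal sketch follows, assuming each assignment is a (scene, emotion level, template) tuple and that the designation information is either "max" or "min"; these data shapes and the function name are hypothetical.

```python
from typing import Dict, List, Tuple

Assignment = Tuple[str, float, str]  # (scene name, emotion level, template name)


def scenes_for_designated(
    assignments: List[Assignment],
    designation: Dict[str, str],  # template name -> "max" or "min"
) -> Dict[str, str]:
    """For each designated template, return the one scene allowed to use it."""
    allowed: Dict[str, str] = {}
    for template, mode in designation.items():
        # Collect every scene for which this template was selected.
        candidates = [(level, scene) for scene, level, t in assignments if t == template]
        if not candidates:
            continue
        chooser = max if mode == "max" else min
        # Compare the emotion levels relative to each other; ties keep the earliest scene.
        _, scene = chooser(candidates, key=lambda c: c[0])
        allowed[template] = scene
    return allowed


assignments = [
    ("scene1", 4.0, "kusudama"),
    ("scene2", 6.0, "kusudama"),
    ("scene3", 5.0, "blush and stand"),
]
```

Scenes that lose a designated template in this way would then need another executable template for the same emotion ID.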

Next, an example of the screen of a digest video program displayed on the TV 408 in Embodiment 4 is described with reference to FIGS. 10(a) and 10(b). As illustrated, the screen of the TV 408 (the PVUI screen) consists of three logical areas: a material display area 1001 for video playback and for displaying subtitles and superimposed text, a studio area 1002 for displaying the actions of the virtual characters and studio effects (including the set, lighting, and camera positions), and an operation menu area 1003 used by the user (viewer) to select operation menu items.

In Embodiment 4, there is one of each of these areas, arranged in a tiled layout without overlap. A tiled layout is used as the multi-window display form because it was considered easier for users unfamiliar with computers to get used to than overlapping windows. The display form may also be made selectable according to the user's computer skills.

FIG. 11 shows an example of the final PVML code created as a digest video program by the digest video programming device 400 of Embodiment 4. It is written so that, for one video scene of the digest video, the virtual character (BOB) first speaks the preamble description, after which the virtual character's speaking of the event description and the playback of the video scene take place in parallel.

The <head> part corresponds to the program definition file, and the <body> part is the program itself. Parallel and sequential processing are written with the <par> and <seq> tags, respectively.

In principle, a PVML statement takes the form <method, target object, parameter list for the method>. Because many methods may be written for the same target object, the tag <set> is provided to specify the target object for the methods that follow. Frequently used macros, such as the banzai (cheering) action, are defined separately in advance as a PVML library.

The positional layout written in the <head> part is as follows. In the layout specification of the head part, the screen is first split vertically into left and right parts (<vertical>), and the left half is then split horizontally (<horisontal>). Sizes are adjusted together while the relationships in this split tree are preserved. The following size change therefore enlarges the operation menu area and shrinks the studio area, changing the screen shown in FIG. 10(a) into the screen shown in FIG. 10(b).
<viewchange area="display" duration="2"
dstx="0" dsty="0" dstheight="500" dstwidth="500"/>

The PVML used in Embodiment 4 calls the functions of SMIL and TVML, so the content of the effects is constrained by the SMIL and TVML specifications. The description language is not limited to PVML, however, and other description languages can clearly be applied to the digest video programming method and digest video programming device of the present invention.

In Embodiment 4 described above, the commentary of the virtual character (speech output of the preamble, event, and postscript descriptions), the playback of each video scene of the digest video, and the display of captions can easily be kept consistent and synchronized, which makes for a presentation that is easy to follow. In the created program, the virtual character explains and comments on the content of the digest video, and the degree value (emotion level parameter) calculated by the video content description generation device 200 of Embodiment 2 is used to give the virtual character expressions of joy, anger, sorrow, and pleasure. When the created program was evaluated, these emotional expressions were found to aid viewers' understanding and to feel familiar and natural.

The digest video programming method of Embodiment 4 described above can be realized by executing on a computer a program prepared in advance according to the procedure given in the description above. The program is provided recorded on a computer-readable recording medium such as a hard disk, floppy (R) disk, CD-ROM, MO, or DVD. Alternatively, it can be distributed via a network.

FIG. 1 is a schematic configuration diagram of the video content description generation device of Embodiment 1.
FIG. 2 is an explanatory diagram showing the algorithm of the connection relation determination function of Embodiment 1.
FIG. 3 is a schematic configuration diagram of the video content description generation device of Embodiment 2.
FIG. 4 is an explanatory diagram showing the algorithm of the emotion degree determination function of Embodiment 2.
FIG. 5 is an explanatory diagram showing the description generation function (description generation algorithm) of Embodiment 3.
FIG. 6 is an explanatory diagram showing the order in which descriptions are generated for a given game when the description generation function of Embodiment 3 is used.
FIG. 7 is a schematic diagram of a digest creation system that incorporates the video content description generation method of the present invention as a video text generation function.
FIG. 8 is a block diagram of the digest video programming device of Embodiment 4.
FIG. 9 is an explanatory diagram showing the outline flow of the processing of the digest video programming device.
FIG. 10 is an explanatory diagram showing an example of the screen of a digest video program displayed on the TV in Embodiment 4.
FIG. 11 is an explanatory diagram showing an example of the final PVML code created as a digest video program by the digest video programming device of Embodiment 4.

Explanation of reference numerals

100 Video content description generation device
101 Description generation unit
102 Video content determination unit
103 Connection expression selection unit
200 Video content description generation device
201 Description generation unit
202 Storage unit
203 Setting unit
204 Calculation unit
400 Digest video programming device
401 Video file generation unit
402 Program definition file database
403 Effect definition database
404 Effect template selection unit
405 Effect processing unit
406 PVML interpreter
407 TVML player
408 TV

Claims

For each video scene retrieved as a digest video scene from one video stream, a plurality of pieces of character information composed of a fragmentary character string or information that can be converted into a character string that describes the content are added. In the video content description generation method having a description generation step of generating a description describing the video content of the video scene using the character information,
A video content determining step of determining the content of each video scene from the character information;
A connection expression selection step of selecting, based on the determination result of the video content determining step and according to the relationship between the preceding and following video scenes, a connection expression from among consequential, adversative, parallel, additive, and selective connections,
The method of generating a video content description, wherein the description generation step connects the description of the relevant video scene before and after the connection using the connection expression selected in the connection expression selection step.

For each video scene searched as a digest video scene from a video stream structured using a hierarchical structure, a fragmentary character string or a plurality of information that can be converted into a character string that describes the content of the video scene is described. When the text information is added, the video content description generation method having a description generation step of generating a description describing the video content of the video scene using the text information,
A video content determining step of determining the content of each video scene from the character information;
A connection expression selection step of selecting, based on the determination result of the video content determining step and according to the relationship between the preceding and following video scenes, a connection expression from among consequential, adversative, parallel, additive, and selective connections,
The method of generating a video content description, wherein the description generation step connects the description of the relevant video scene before and after the connection using the connection expression selected in the connection expression selection step.

The method of generating a video content description according to claim 2, wherein, when generating a description of a video scene at a certain level of the hierarchy, the description generation step further uses the hierarchical structure to generate, together with the description of the video content of that video scene, a preamble sentence serving as a preamble to the description from the character information of the video scenes at levels above it.

The method of generating a video content description according to claim 2, wherein, when generating a description of a video scene at a certain level of the hierarchy, the description generation step further uses the hierarchical structure to generate, together with the description of the video content of that video scene, a postscript sentence serving as a postscript to the description from the character information of the video scenes at levels above it.

The method of generating a video content description according to claim 1, wherein, when generating the description of a video scene, the description generation step further changes the wording of the description to suit the user's preference, using preset user preference information.

For each video scene retrieved as a digest video scene from one video stream, a plurality of pieces of character information composed of a fragmentary character string or information that can be converted into a character string that describes the content are added. In the video content description generation device having a description generation means for generating a description describing the video content of the video scene using the character information,
Video content determining means for determining the content of each video scene from the character information,
Connection expression selecting means for selecting, based on the determination result of the video content determining means and according to the relationship between the preceding and following video scenes, a connection expression from among consequential, adversative, parallel, additive, and selective connections,
A video content description generating apparatus, wherein the description generating means connects the description of the preceding and succeeding video scenes using the connection expression selected by the connection expression selecting means.

For each video scene searched as a digest video scene from a video stream structured using a hierarchical structure, a fragmentary character string or a plurality of information that can be converted into a character string that describes the content of the video scene is described. When the text information is added, in the video content description generation device having a description generation means for generating a description to explain the video content of the video scene using the text information,
Video content determining means for determining the content of each video scene from the character information,
Connection expression selecting means for selecting, based on the determination result of the video content determining means and according to the relationship between the preceding and following video scenes, a connection expression from among consequential, adversative, parallel, additive, and selective connections,
A video content description generating apparatus, wherein the description generating means connects the description of the preceding and succeeding video scenes using the connection expression selected by the connection expression selecting means.

8. The video content description generating apparatus according to claim 7, wherein, when generating a description for a video scene in a given layer, the description generating means further uses the hierarchical structure to generate, in addition to the description of the video content of the video scene in that layer, a preamble sentence that introduces the description, from the character information of a video scene in a layer above that video scene.

9. The video content description generating apparatus according to claim 7, wherein, when generating a description for a video scene in a given layer, the description generating means further uses the hierarchical structure to generate, in addition to the description of the video content of the video scene in that layer, a postscript sentence that follows the description, from the character information of a video scene in a layer above that video scene.
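
By way of a non-limiting illustration, generating a preamble sentence and a postscript sentence from the layer above a scene might be sketched as follows. The `Node` class, its `label` and `closing` fields, and the sentence wording are all assumptions made for this sketch; the claims state only that the upper layer's character information is the source.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    label: str                       # character information attached to this layer
    parent: Optional["Node"] = None  # the layer above, if any
    closing: str = ""                # hypothetical: summary text held by this layer


def describe(scene: Node, body: str) -> str:
    """Wrap a scene description with text derived from the layer above it."""
    upper = scene.parent
    preamble = f"In the {upper.label}:" if upper else ""
    postscript = upper.closing if upper else ""
    # Skip empty parts so a top-level scene yields the bare description.
    return " ".join(p for p in (preamble, body, postscript) if p)
```

A scene with no upper layer is returned unchanged, so the same function can serve every layer of the structure.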

A computer-readable recording medium having recorded thereon a program for causing a computer to execute the method for generating a description of a video content according to claim 1.

A digest video programming method in which each video scene retrieved from one video stream as a scene for a digest video and a description of the video content created for each video scene are input, and a digest video program is created by presenting, along with the reproduction of each video scene, the description of the video content by voice or text via a preset virtual character, the method comprising:
an input step of inputting, together with each video scene and the description of its video content, a degree value of the emotional reaction of the virtual character to the video content of that video scene; and
an effect step of performing, for each video scene, effect processing of the emotional expression of the virtual character based on the degree value.

A digest video programming method in which the video scenes for a digest video are input together with a description, a preamble sentence, a postscript sentence, and a degree value for each video scene, and a digest video program is created, the method comprising:
an effect step of presenting, along with the reproduction of each video scene, the description, the preamble sentence, and the postscript sentence by voice via a preset virtual character, and of performing, for each video scene, effect processing of the emotional expression of the virtual character based on the degree value.

A digest video programming apparatus that receives each video scene retrieved from one video stream as a scene for a digest video, together with a description, a preamble sentence, and a postscript sentence created in advance for each video scene and a degree value indicating the degree of change in the user's emotional state in response to the video content, and creates a digest video program, the apparatus comprising:
video file generating means for generating a video file, as the unit of programming processing, by associating the description, the preamble sentence, the postscript sentence, and the degree value with each video scene;
a program definition file database that stores, as a program definition file, various configuration information of a program including at least a virtual character;
an effect definition database in which a plurality of degrees of emotional expression are set and which stores, for each degree of emotional expression, an effect template defining one effect method;
selecting means for receiving the video file, determining the degree of emotional expression based on the degree value of each video file, and selecting from the effect definition database an effect template corresponding to that degree of emotional expression; and
effect processing means for receiving the program definition file, the video file, and the effect template, and performing program effect processing for each video file by setting, based on the selected effect template, at least the reproduction timing of the video scene, the output timing of the description, the preamble sentence, and the postscript sentence as the voice of the virtual character, and the motion of the virtual character.

The digest video programming apparatus according to claim 13, wherein the various configuration information of the program in the program definition file includes at least one virtual character and information such as the studio set of the program, the number and positions of cameras, CG lighting, CG props, sound, the program title, and superimposed caption settings.

15. The digest video programming apparatus according to claim 13, wherein:
the degree value has emotion type information indicating the type of emotion;
the effect definition database stores a plurality of effect templates classified using the emotion type information and the degree of emotional expression as key indexes; and
the selecting means, when selecting an effect template, determines from the degree value the emotion type information and the degree of emotional expression to be used as key indexes, and selects all applicable effect templates from the effect definition database.
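
By way of a non-limiting illustration, an effect definition database keyed by emotion type and degree of emotional expression might be sketched as follows. The numeric thresholds in `expression_level`, the level names, and the template names are assumptions made for this sketch; the claims do not fix how a degree value maps to a degree of emotional expression.

```python
from collections import defaultdict


def expression_level(degree_value: float) -> str:
    """Map a numeric degree value to a degree of emotional expression (assumed thresholds)."""
    if degree_value >= 0.7:
        return "high"
    if degree_value >= 0.3:
        return "medium"
    return "low"


class EffectDefinitionDB:
    """Effect templates indexed by (emotion type, degree of emotional expression)."""

    def __init__(self):
        self._by_key = defaultdict(list)

    def add(self, emotion_type: str, level: str, template_name: str) -> None:
        self._by_key[(emotion_type, level)].append(template_name)

    def select_all(self, emotion_type: str, degree_value: float) -> list:
        """Return every template registered under the key derived from the degree value."""
        key = (emotion_type, expression_level(degree_value))
        return list(self._by_key.get(key, []))
```

Returning all matching templates, rather than one, leaves the final choice to a later stage, which is where environment checks and use limits can be applied.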

16. The digest video programming apparatus according to claim 15, wherein:
the degree value can be composed of a plurality of degree values;
the effect definition database stores a plurality of effect templates classified using a plurality of pieces of emotion type information and the degree of emotional expression of each piece of emotion type information as key indexes; and
the selecting means, when selecting an effect template, determines from the plurality of degree values the plurality of pieces of emotion type information and the degree of emotional expression of each to be used as key indexes, and selects all applicable effect templates from the effect definition database.

17. The digest video programming apparatus according to claim 13, further comprising designating means for designating a desired program definition file from among a plurality of program definition files, wherein:
the program definition file database stores a plurality of program definition files; and
the effect processing means, when performing the program effect processing for each video file, receives the program definition file designated via the designating means and performs the program effect processing for each video file based on the corresponding configuration information.

18. The digest video programming apparatus according to claim 17, wherein:
program environment information, indicating the program environment in which the defined effect method can be applied, is set in the effect template; and
when a plurality of effect templates are selected by the selecting means, the effect processing means refers to the program environment information of each effect template, selects one effect template that is executable in the program environment provided by the program definition file designated via the designating means, and performs the program effect processing for each video file.
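
By way of a non-limiting illustration, checking effect templates against the program environment might be sketched as follows. Modeling the program environment information as a set of required feature names is an assumption made for this sketch, as are the feature and template names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EffectTemplate:
    name: str
    requires: frozenset  # hypothetical: program features the effect method needs


def first_executable(candidates, program_features):
    """Return the first candidate whose requirements the program environment satisfies."""
    for template in candidates:
        if template.requires <= program_features:  # subset test
            return template
    return None  # nothing selected can run in this program environment
```

Here a template that needs CG props would be skipped for a program definition file that provides only CG lighting and cameras.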

19. The digest video programming apparatus according to claim 18, wherein:
use count limitation information, which limits the number of times the defined effect method is used when one digest video is made into a program, can be set in the effect template; and
the effect processing means, after selecting one executable effect template, if use count limitation information is set in that effect template, compares the number of times the selected effect template has been used so far with the use count limitation information to determine whether the effect template can be used, and, if it cannot, selects another executable effect template.
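
By way of a non-limiting illustration, enforcing a use count limit and falling back to another executable effect template might be sketched as follows. The function name, the dictionary of limits, and the convention that a missing entry means "no limit" are assumptions made for this sketch.

```python
from collections import Counter


def pick_with_limit(executable, limits, used):
    """Pick the first executable template that has not exhausted its use limit.

    executable: template names in preference order
    limits:     name -> maximum uses per digest program (absent means unlimited)
    used:       Counter of uses so far; updated in place when a template is picked
    """
    for name in executable:
        limit = limits.get(name)
        if limit is None or used[name] < limit:
            used[name] += 1
            return name
    return None  # every executable template has reached its limit
```

Keeping the counter outside the function lets one count span the whole digest program, which is the scope the claim gives for the limit.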

20. The digest video programming apparatus according to claim 15, wherein:
the program effect processing for each video file in the effect processing means has a sequential processing function that, each time the selecting means finishes selecting effect templates for one video file, selects the effect template to be used and performs the processing, and a batch processing function that waits until the selection of effect templates for all video files is completed, then selects the effect template to be used for each video file and performs the processing; and
the effect processing means, when performing processing using the batch processing function, refers to all effect templates selected by the selecting means, obtains, for each set of emotion type information and degree of emotional expression, the number of times that set was selected, and, when a set selected a plurality of times contains a plurality of different effect templates, selects effect templates so that the number of times each effect template is selected is uniform.
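
By way of a non-limiting illustration, evening out template use in batch processing might be sketched as follows. Round-robin rotation is one simple way to keep the selection counts as uniform as possible and is an assumption of this sketch; the claim does not specify the mechanism. The sketch also assumes that every video file sharing a key receives the same candidate list.

```python
from itertools import cycle


def assign_uniformly(selections):
    """Choose one template per video file, rotating candidates within each key.

    selections: one (key, candidates) pair per video file, in playback order,
                where key is (emotion type, degree of emotional expression).
    """
    rotors = {}
    result = []
    for key, candidates in selections:
        if key not in rotors:
            rotors[key] = cycle(candidates)  # one rotation per key
        result.append(next(rotors[key]))
    return result
```

With two candidates and three video files under the same key, the counts come out as two and one, which is the closest to uniform that three selections allow.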

21. The digest video programming apparatus according to claim 20, wherein:
designation information can be set in the effect template, specifying that the effect template is to be used for the program effect processing of the video file having the maximum degree value, or of the video file having the minimum degree value, among the degree values associated with the emotion type information and the degree of emotional expression of that effect template; and
when there is an effect template in which the designation information is set, the effect processing means compares the degree values of all video files for which that effect template was selected, and uses the effect template only for the program effect processing of the video file having the maximum degree value or the minimum degree value.
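
By way of a non-limiting illustration, restricting a designated effect template to the video file with the maximum or minimum degree value might be sketched as follows. The tuple layout of `files` and the `"max"`/`"min"` mode strings are assumptions made for this sketch.

```python
def restrict_to_extreme(files, template, mode):
    """Return the id of the only video file allowed to use `template`.

    files: list of (file_id, degree_value, set of selected template names)
    mode:  "max" or "min", standing in for the designation information
    """
    holders = [(fid, value) for fid, value, selected in files if template in selected]
    if not holders:
        return None  # the template was not selected for any video file
    chooser = max if mode == "max" else min
    return chooser(holders, key=lambda holder: holder[1])[0]
```

The comparison is relative to the other video files in the same digest, so it can only be made once all selections are known, which is consistent with the batch processing function.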

A computer-readable recording medium having recorded thereon a program for causing a computer to execute the digest video programming method according to claim 11 or 12.