JP2004127324A

JP2004127324A - Data processing apparatus, data processing method, recording medium, and program for making computer to perform the data processing method

Info

Publication number: JP2004127324A
Application number: JP2004007748A
Authority: JP
Inventors: Toshihiko Munetsugi; 宗續　敏彦; Minoru Eito; 栄藤　稔; Shoichi Araki; 荒木　昭一; Koichi Emura; 江村　恒一
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 1998-12-25
Filing date: 2004-01-15
Publication date: 2004-04-22

Abstract

PROBLEM TO BE SOLVED: To provide a data processing apparatus, a data processing method, a program for causing a computer to execute the data processing method, and a recording medium on which the program is recorded, with which a required scene can be freely selected from media content.

SOLUTION: The data processing apparatus receives context content description data that describes segments, each representing one scene of media content composed of a plurality of scenes, together with a score, given as attribute information of each segment, that represents a degree of importance based on the contextual content of the media content, and selects segments based on the score.

COPYRIGHT: (C)2004, JPO

Description

The present invention provides a media content data processing apparatus, data processing method, recording medium, and program for reproducing or delivering only a synopsis or highlight scenes of media content, or only the scenes a viewer wishes to see, when viewing, reproducing, delivering, or storing continuous audiovisual information (media content) such as moving images, video, and audio.

Conventionally, media content has been reproduced, delivered, and stored in units of the files that store it.

As a method of searching for a specific scene in a moving image, Japanese Patent Application Laid-Open No. H10-111872 detects scene changes (scene cuts) in the moving image and attaches to each scene cut additional information consisting of the time code of its start frame, the time code of its end frame, and keywords for the scene.

Alternatively, Carnegie Mellon University (CMU) has summarized moving images by detecting scene cuts, detecting human faces and captions, and detecting key phrases by speech recognition (Michael A. Smith and Takeo Kanade, "Video Skimming and Characterization through the Combination of Image and Language Understanding Techniques", CMU-CS-97-111, February 3, 1997).

With the conventional methods, however, it is impossible to view a synopsis of the content when reproduction is performed file by file. Even when searching for a highlight scene or a scene the user wants to see, the content must be browsed from the beginning. Moreover, delivering a moving image takes a great deal of time because all of the data in the file is transmitted.

With the method of Japanese Patent Application Laid-Open No. H10-111872, scenes can be searched by keyword, so finding a scene the user wants is easy. However, the additional information carries no relationships or connections between scenes, which makes it difficult, for example, to retrieve one section of a story. In addition, a keyword-only search cannot easily tell which scenes are important in context, so creating a synopsis or a highlight scene collection is also difficult.

The CMU method can summarize a moving image, but it yields only a single result, so it is difficult to produce summaries with different playback times, such as a five-minute summary and a three-minute summary. It is also difficult to summarize according to a user's request, for example by selecting scenes in which a specific person appears.

An object of the present invention is to provide means for selecting, reproducing, and delivering only a synopsis or highlight scenes of media content, or only the scenes a viewer wishes to see, when the media content is reproduced.

Another object is to provide means for matching the playback time to a time desired by the user when a synopsis, highlight scenes, or scenes desired by the viewer are selected.

A further object is to provide means for delivering, at the user's request, only a synopsis, a highlight scene collection, or scenes desired by the user, within a playback time desired by the user, when media content is delivered.

A still further object is to provide means for adjusting the amount of data to be delivered according to the condition of the line over which the server and the user communicate.

A data processing apparatus according to the present invention comprises: means for inputting context content description data in which a segment representing each scene of media content composed of a plurality of scenes, and a score that is attribute information of the segment and represents a degree of importance based on the contextual content of the media content, are described; and selecting means for selecting a segment based on the score.

With this configuration, a required scene can be freely selected from the media content.

In the above data processing apparatus, the scenes of the media content are delimited in time at scene boundaries, and time information representing those boundaries is described in the context content description data as the attribute information.

In the above data processing apparatus, the time information includes the start time and the end time of each scene.

In the above data processing apparatus, the time information includes the start time and the duration of each scene.

In the above data processing apparatus, a plurality of the segments are described hierarchically in the context content description data.

In the above data processing apparatus, the context content description data has auxiliary information relating to the contextual content.

In the above data processing apparatus, the media content is at least one of video information and sound information.

In the above data processing apparatus, a link to representative data representing the segment is attached to the segment.

In the above data processing apparatus, the representative data is at least one of video information and sound information.

A data processing method according to the present invention inputs context content description data in which a segment representing each scene of media content composed of a plurality of scenes, and a score that is attribute information of the segment and represents a degree of importance based on the contextual content of the media content, are described, and selects a segment based on the score.

With this method, a required scene can be freely selected from the media content.

In the above data processing method, the scenes of the media content are delimited in time at scene boundaries, and time information representing those boundaries is described in the context content description data as the attribute information.

In the above data processing method, the time information includes the start time and the end time of each scene.

In the above data processing method, the time information includes the start time and the duration of each scene.

In the above data processing method, a plurality of the segments are described hierarchically in the context content description data.

In the above data processing method, the context content description data has auxiliary information relating to the contextual content.

In the above data processing method, the media content is at least one of video information and sound information.

In the above data processing method, a link to representative data representing the segment is attached to the segment.

In the above data processing method, the representative data is at least one of video information and sound information.

A program according to the present invention causes a computer to execute: a step of inputting context content description data in which a segment representing each scene of media content composed of a plurality of scenes, and a score that is attribute information of the segment and represents a degree of importance based on the contextual content of the media content, are described; and a step of selecting a segment based on the score.

With this program, a required scene can be freely selected from the media content.

In the above program, the scenes of the media content are delimited in time at scene boundaries, and time information representing those boundaries is described in the context content description data as the attribute information.

A recording medium according to the present invention is a computer-readable recording medium on which the above program is recorded.

With this recording medium, a program for freely selecting a required scene from media content can be recorded.

The present invention can provide a data processing apparatus and a data processing method with which a required scene can be freely selected from media content, a program for causing a computer to execute the data processing method, and a recording medium on which the program is recorded.

Embodiments of the present invention are described below with reference to the drawings.

[First Embodiment]
A first embodiment according to the present invention is described below. In this embodiment, the media content is assumed to be a moving image in an MPEG1 system stream. In this case, a media segment corresponds to one scene cut. Also in this embodiment, the score is an objective degree of importance based on the contextual content of the scene concerned.

FIG. 1 is a block diagram of the data processing method according to this embodiment. In FIG. 1, 101 denotes a selection step and 102 denotes an extraction step. The selection step 101 selects scenes of the media content from the context content description data and outputs the start time and end time of each selected scene. The extraction step 102 extracts the data of the sections of the media content delimited by the start times and end times output by the selection step 101.

FIG. 2 shows the structure of the context content description data of this embodiment. In this embodiment, the contextual content is described as a tree structure, and sibling nodes of the tree are arranged in chronological order from the left. In FIG. 2, the root of the tree, denoted <contents>, represents one piece of content and carries the title of that content as an attribute.

The child elements of <contents> are <section> elements. Each <section> carries an attribute, priority, that represents the importance of the scene in terms of contextual content. The importance is an integer from 1 to 5, where 1 is the least important and 5 is the most important.

The child elements of a <section> are either <section> or <segment> elements; that is, a <section> may have <section> elements as its children. However, <section> and <segment> elements must not be mixed as children of a single <section>.

A <segment> represents one scene cut. It carries the same priority attribute as a <section> and, as time information for the scene, the attributes start, representing the start time, and end, representing the end time. Scene cuts may be detected with commercially available software or software distributed over a network, or may be marked manually. In this embodiment the time information consists of the start time and end time of the scene cut, but the same effect is obtained when it consists of the start time and the duration of the scene. In that case, the end time of the scene is obtained by adding the duration to the start time.
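The tree described above can be modelled in memory as follows. This is a minimal sketch, not the representation used in the patent: the class names `Section` and `Segment`, the helper `check_priority`, and the use of seconds as the time unit are assumptions made for illustration. It enforces the 1 to 5 priority range, the rule that <section> and <segment> children are not mixed, and the conversion from start time plus duration to end time.

```python
from dataclasses import dataclass, field
from typing import List, Union


def check_priority(priority: int) -> int:
    """Importance is an integer from 1 (least important) to 5 (most important)."""
    if not 1 <= priority <= 5:
        raise ValueError("priority must be an integer from 1 to 5")
    return priority


@dataclass
class Segment:
    """One scene cut with its priority and time information."""
    priority: int
    start: float  # assumed unit: seconds from the beginning of the content
    end: float

    def __post_init__(self) -> None:
        check_priority(self.priority)

    @classmethod
    def from_duration(cls, priority: int, start: float, duration: float) -> "Segment":
        # End time is the start time plus the duration.
        return cls(priority, start, start + duration)


@dataclass
class Section:
    """A node whose children are all Sections or all Segments, never both."""
    priority: int
    children: List[Union["Section", Segment]] = field(default_factory=list)

    def __post_init__(self) -> None:
        check_priority(self.priority)
        if len({type(child) for child in self.children}) > 1:
            raise ValueError("<section> and <segment> children must not be mixed")
```

Storing only `start` and `end` keeps the selection logic independent of which of the two time representations the description data originally used.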

With this context content description data, a story such as a movie can be described in terms of chapters, sections, paragraphs, and so on by using multiple levels of <section>. As another example, when describing a baseball game, the top-level <section> elements can describe the innings, their child <section> elements the top and bottom halves, the next level of <section> elements each batter's turn at bat, and the <section> elements below those each pitch, the intervals between pitches, the result of the at-bat, and so on.

One way to represent context content description data of this structure on a computer is the Extensible Markup Language (XML). XML is a data description language being standardized by the World Wide Web Consortium; Version 1.0 became a Recommendation on February 10, 1998. The XML 1.0 specification is available at http://www.w3.org/TR/1998/REC-xml-19980210. FIGS. 3 to 9 show a Document Type Definition (DTD) for describing the context content description data of this embodiment in XML, together with an example of context content description data written according to this DTD. FIGS. 10 to 19 show an example of context content description data obtained by adding representative data (dominant-data) of the media segments, such as representative images (video information) and keywords (sound information), to the context content description data shown in FIGS. 3 to 9, together with a DTD for describing that data in XML.
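To make the XML representation concrete, the sketch below parses a small hand-written document with Python's standard `xml.etree.ElementTree` module. The element and attribute names (contents, section, segment, title, priority, start, end) follow the text above, but the document itself, the title value, and the use of plain seconds for the time attributes are assumptions; the actual DTD appears only in the figures and is not reproduced here.

```python
import xml.etree.ElementTree as ET

# Hypothetical description data; times are assumed to be seconds.
EXAMPLE_XML = """\
<contents title="Sample game">
  <section priority="4">
    <section priority="5">
      <segment priority="3" start="0" end="12"/>
      <segment priority="5" start="12" end="30"/>
    </section>
    <section priority="2">
      <segment priority="2" start="30" end="45"/>
    </section>
  </section>
</contents>
"""

root = ET.fromstring(EXAMPLE_XML)

# Collect every <segment> in document order, which is also chronological
# order because siblings are arranged from left to right by time.
segments = [
    (int(s.get("priority")), float(s.get("start")), float(s.get("end")))
    for s in root.iter("segment")
]

print(root.get("title"), segments)
```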

The processing in the selection step 101 is described below. This processing is closely tied to the format of the context content description data and to how scores are assigned to the contextual content of each scene. In this embodiment, the selection step 101 considers only those <section> elements that have <segment> elements as children, as shown in FIG. 22 (S1, S4, and S5 in FIG. 23). It selects each such <section> whose priority value is greater than a given threshold (S2 in FIG. 23) and outputs its start time and end time (S3 in FIG. 23). For this reason, the priority of a <section> that has <segment> children is defined as its importance relative to all the <section> elements in the content that have <segment> children; that is, the priority is set to the importance among the <section> elements enclosed by the dotted line in FIG. 22. The priorities of the other <section> elements and of the <segment> elements may be assigned arbitrarily. The importance values need not all be different; different elements may have the same importance. FIG. 23 shows a flowchart of the processing in the selection step of this embodiment. For each selected <section>, the start time and end time of the scene it represents are determined from its child <segment> elements, and that start time and end time are output.
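The selection rule of this embodiment can be sketched as follows. This is an illustrative reading of the description, not the flowchart of FIG. 23: the function name `select_sections`, the sample document, and the choice to take the scene's start from the first child <segment> and its end from the last one (relying on siblings being in chronological order) are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical description data; times are assumed to be seconds.
DOC = """\
<contents title="Sample game">
  <section priority="4">
    <section priority="5">
      <segment priority="3" start="0" end="12"/>
      <segment priority="5" start="12" end="30"/>
    </section>
    <section priority="2">
      <segment priority="2" start="30" end="45"/>
    </section>
  </section>
</contents>
"""


def select_sections(root, threshold):
    """Return (start, end) for each <section> that has <segment> children
    and whose priority is greater than the threshold."""
    selected = []
    for section in root.iter("section"):
        segments = section.findall("segment")  # direct children only
        if not segments:
            continue  # a higher-level <section>; not considered here
        if int(section.get("priority")) > threshold:
            # Siblings are in time order, so the scene spans from the
            # first child's start to the last child's end.
            start = float(segments[0].get("start"))
            end = float(segments[-1].get("end"))
            selected.append((start, end))
    return selected


print(select_sections(ET.fromstring(DOC), threshold=3))
```

Raising or lowering the threshold changes how many sections are selected, and therefore the total playback time of the result.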

Although this embodiment processes the <section> elements that have <segment> children, the selection may instead be made among the <segment> elements themselves. In that case, the priority is the importance relative to all the <segment> elements in the content. Alternatively, the selection may be made among higher-level <section> elements that have no <segment> children and that lie at the same level of the hierarchy; that is, among the <section> elements at the same path length counted from <contents> or from a <segment>.
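The two alternatives mentioned above might look like this in code. Again this is only a sketch under assumptions: `select_segments` and `sections_at_depth` are hypothetical helper names, the sample document is invented, and depth is counted from <contents> (depth 1 being its direct <section> children).

```python
import xml.etree.ElementTree as ET

# Hypothetical description data; times are assumed to be seconds.
DOC = """\
<contents title="Sample game">
  <section priority="4">
    <section priority="5">
      <segment priority="3" start="0" end="12"/>
      <segment priority="5" start="12" end="30"/>
    </section>
    <section priority="2">
      <segment priority="2" start="30" end="45"/>
    </section>
  </section>
</contents>
"""


def select_segments(root, threshold):
    """Alternative 1: select among the <segment> elements themselves."""
    return [
        (float(s.get("start")), float(s.get("end")))
        for s in root.iter("segment")
        if int(s.get("priority")) > threshold
    ]


def sections_at_depth(root, depth):
    """Alternative 2: collect the <section> elements at one level,
    counted as the path length from <contents>."""
    level = [root]
    for _ in range(depth):
        level = [child for node in level for child in node.findall("section")]
    return level


root = ET.fromstring(DOC)
print(select_segments(root, 2))
print([int(s.get("priority")) for s in sections_at_depth(root, 2)])
```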

The operation of the extraction step 102 is described below with reference to FIG. 24, which is a block diagram of the extraction step 102 according to this embodiment. As shown in FIG. 24, the extraction step 102 comprises separating means 601, video skimming means 602, and audio skimming means 603. In this embodiment, an MPEG1 system stream is assumed as the media content. An MPEG1 system stream is a multiplex of video streams and audio streams, and the separating means 601 separates the multiplexed system stream into video streams and audio streams. The video skimming means 602 receives a separated video stream and the sections output by the selection step 101, and outputs only the data of the selected sections from the input video stream. The audio skimming means 603 receives a separated audio stream and the sections output by the selection step 101, and outputs only the data of the selected sections from the input audio stream.

The processing of the separating means 601 is described below with reference to the drawings. FIG. 25 shows a flowchart of this processing. The multiplexing scheme of the MPEG1 system stream is standardized in the international standard ISO/IEC IS 11172-1, under which video streams and audio streams are multiplexed in packets. In packet multiplexing, each video stream and audio stream is divided into pieces of suitable length called packets, and additional information such as a header is attached to each. A system stream may contain more than one video stream and more than one audio stream. The packet header contains a stream id that distinguishes video from audio and a time stamp for synchronizing video and audio. The stream id not only distinguishes video from audio but also identifies which stream a packet belongs to when there are several video streams, and likewise when there are several audio streams. An MPEG1 system stream is organized in units called packs, each of which bundles a plurality of packets. Each pack has a header carrying additional information such as the multiplexing rate and a time reference for synchronized playback. The first pack additionally carries a system header with information such as the number of multiplexed video streams and audio streams. The separating means 601 first reads the numbers of multiplexed video streams and audio streams from the system header of the first pack (S1, S2) and reserves an area for storing the data of each stream (S3, S4). It then examines the stream id of each packet and writes the packet data to the data area reserved for the stream identified by that stream id (S5, S6). This processing is repeated for all packets (S8, S9, S10). After all the data has been processed, each video stream is output to the video skimming means 602 and each audio stream to the audio skimming means 603 (S11).
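The per-packet routing performed by the separating means can be sketched as below. This is not a parser for real MPEG1 system streams: packets are modelled as already-decoded `(stream_id, payload)` pairs, the function name `demultiplex` is hypothetical, and the stream id ranges used (0xC0 to 0xDF for audio, 0xE0 to 0xEF for video) are the values I understand ISO/IEC 11172-1 to assign, which should be checked against the standard. Storage is also allocated lazily here, rather than up front from the system header as in steps S1 to S4.

```python
from collections import defaultdict


def demultiplex(packets):
    """Route packet payloads to per-stream buffers keyed by stream id.

    `packets` is an iterable of (stream_id, payload_bytes) pairs, a
    simplified stand-in for parsed MPEG1 system stream packets.
    """
    video = defaultdict(bytearray)
    audio = defaultdict(bytearray)
    for stream_id, payload in packets:
        if 0xE0 <= stream_id <= 0xEF:    # assumed video stream id range
            video[stream_id] += payload
        elif 0xC0 <= stream_id <= 0xDF:  # assumed audio stream id range
            audio[stream_id] += payload
        # Any other stream id (for example padding) is ignored in this sketch.
    return dict(video), dict(audio)


video_streams, audio_streams = demultiplex(
    [(0xE0, b"v1"), (0xC0, b"a1"), (0xE0, b"v2"), (0xC1, b"b1")]
)
print(sorted(video_streams), sorted(audio_streams))
```

Each buffer in `video_streams` would then go to the video skimming means and each buffer in `audio_streams` to the audio skimming means.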

The operation of the video skimming means 602 is described below. FIG. 26 shows a flowchart of its processing. An MPEG1 video stream is standardized in the international standard ISO/IEC IS 11172-2 and, as shown in FIG. 27, consists of a sequence layer, a GOP layer, a picture layer, a slice layer, a macroblock layer, and a block layer. The smallest unit of random access is the GOP (Group Of Pictures) layer, and each unit of the picture layer corresponds to one frame. The video skimming means 602 processes data in units of GOPs. As initialization, a counter C of the number of frames output so far is set to 0 (S3). The video skimming means 602 first confirms that the video stream begins with a sequence layer header (S2, S4), saves that header data (S5), and outputs it. Sequence layer headers may appear again later in the stream, but none of their values other than the quantization matrices may change, so each time a sequence header is input its values are compared with the saved ones (S8, S14), and a difference in any value other than the quantization matrices is treated as an error (S15). Next, the video skimming means 602 detects a GOP layer header in the input data (S9). The GOP layer header contains time code data (S10), which gives the time elapsed from the beginning of the sequence. The video skimming means 602 compares this time code with the sections output by the selection step 101 (S1) (S11). If the time code is not within a selected section, the video skimming means 602 discards all data up to the next GOP layer or sequence layer. If the time code is within a selected section, it outputs all data up to the next GOP layer or sequence layer (S13). To keep the output continuous with the data output so far, however, the time code of the GOP layer must be rewritten (S12), and the new time code is derived from the counter C. Because C is the number of frames output so far, the time Tv at which the first frame of the GOP layer about to be output is displayed is given by equation (1) below, using C and the picture rate pr, which is the number of pictures displayed per second as described in the sequence header.

Tv = C / pr    (1)

Because Tv is a value in units of 1/pr second, it is converted into the MPEG1 time code format and set as the time code of the GOP layer about to be output. When the data of a GOP layer is output, the number of picture layers output is added to the counter C. This processing is repeated until the end of the video stream (S7, S16). When the separating means 601 outputs a plurality of video streams, the processing is performed for each video stream.
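The time code rewriting can be illustrated with a small model. This sketch does not parse MPEG1 video: each GOP is represented as a `(time_in_seconds, picture_count)` pair, and the function names are hypothetical. It also assumes an integer picture rate, no drop-frame handling, a time code expressed as hours, minutes, seconds, and pictures, and half-open sections `start <= t < end`; the source does not specify how the section boundaries are treated.

```python
def frames_to_time_code(c, pr):
    """Convert a count of output frames C into (hours, minutes, seconds,
    pictures), i.e. Tv = C / pr split into time code fields."""
    total_seconds, pictures = divmod(c, pr)
    total_minutes, seconds = divmod(total_seconds, 60)
    hours, minutes = divmod(total_minutes, 60)
    return hours, minutes, seconds, pictures


def skim_gops(gops, sections, pr):
    """Keep the GOPs whose original time falls in a selected section and
    give each kept GOP a new, continuous time code."""
    c = 0  # number of frames output so far (counter C)
    output = []
    for time_sec, picture_count in gops:
        if any(start <= time_sec < end for start, end in sections):
            output.append((frames_to_time_code(c, pr), picture_count))
            c += picture_count  # add the pictures just output to C
        # GOPs outside every selected section are discarded.
    return output


gops = [(0.0, 15), (0.5, 15), (1.0, 15), (1.5, 15)]
print(skim_gops(gops, sections=[(1.0, 2.0)], pr=30))
```

In the example, the two kept GOPs originally started at 1.0 s and 1.5 s, but in the output they start at picture 0 and picture 15 of second 0, so the skimmed stream plays back without a gap in its time codes.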

The processing of the audio skimming means 603 is described below. FIG. 28 shows a flowchart of this processing. MPEG audio is standardized in the international standard ISO/IEC IS 11172-3 and consists of frames called AAUs (Audio Access Units). FIG. 29 shows the structure of an AAU. An AAU is the smallest unit that can be decoded into audio data independently, and it always contains data for a fixed number of samples Sn. The playback time of one AAU can therefore be calculated from the bit rate br (the transmission rate), the sampling frequency Fs, and the number of bits L in the AAU. First, the number of bits L in one AAU is obtained by detecting AAU headers in the audio stream (S2, S5). The AAU header also contains the bit rate br and the sampling frequency Fs. The number of samples Sn in one AAU is given by equation (2) below.

Sn = (L × Fs) / br    (2)

The playback time Tu of one AAU is given by equation (3) below (S3).

Tu = Sn / Fs = L / br    (3)

Once Tu is known, the time from the beginning of the stream can be obtained by counting AAUs. The audio skimming means 603 counts the AAUs that have appeared so far and calculates the time from the beginning of the stream (S7). It compares this time with the sections output by the selection step 101 (S8). If the time at which an AAU appears falls within a selected section, the audio skimming means 603 outputs all of the data of that AAU (S9); otherwise it discards the data of that AAU. This processing is repeated until the end of the audio stream (S6, S11). When the separating means 601 outputs a plurality of audio streams, the processing is performed for each audio stream.
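Equations (2) and (3) and the AAU filtering loop can be sketched as follows. The function names are hypothetical, AAUs are represented only by their index rather than parsed from a real stream, and the sections are treated as half-open (`start <= t < end`), which is an assumption. The sample figures (L = 4608 bits, br = 192000 bit/s, Fs = 48000 Hz) were chosen for illustration so that Sn works out to 1152 samples.

```python
def samples_per_aau(l_bits, fs, br):
    """Equation (2): Sn = (L x Fs) / br."""
    return l_bits * fs / br


def aau_duration(l_bits, br):
    """Equation (3): Tu = Sn / Fs = L / br, in seconds."""
    return l_bits / br


def skim_aaus(aau_count, tu, sections):
    """Return the indices of the AAUs whose time from the beginning of the
    stream (index x Tu) falls inside one of the selected sections."""
    kept = []
    for index in range(aau_count):
        t = index * tu  # time at which this AAU appears
        if any(start <= t < end for start, end in sections):
            kept.append(index)
        # AAUs outside every selected section are discarded.
    return kept


sn = samples_per_aau(4608, 48000, 192000)
tu = aau_duration(4608, 192000)
print(sn, tu, skim_aaus(10, tu, [(0.0, 0.05)]))
```

With these figures each AAU lasts 24 ms, so a 50 ms section starting at time 0 keeps the AAUs that begin at 0 ms, 24 ms, and 48 ms.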

As an effect of the present embodiment, as shown in FIG. 30, the video stream and the audio stream output from the extraction step 102 are input to a video playback means and an audio playback means, respectively, and are played back in synchronization, so that a synopsis or the highlight scenes of the media content can be played back. In addition, by multiplexing the obtained video stream and audio stream, an MPEG1 system stream containing the synopsis or a collection of highlight scenes of the media content can be created.

[Second Embodiment]
A second embodiment according to the present invention is described below. This embodiment differs from the first embodiment only in the processing of the selection step.

The processing of the selection step 101 in the present embodiment is described below with reference to the drawings. The selection step 101 in the present embodiment uses every priority from the top-level <section> down to the leaf <segment>. The priority of each <section> and <segment> is an objective degree of importance in the context content. This processing is explained with reference to FIG. 31. In FIG. 31, 1301 is one of the top-level <section> elements in the context content description data. 1302 is a child <section> of <section> 1301. 1303 is a child <section> of <section> 1302. 1304 is a child <segment> of <section> 1303. The selection step 101 in the present embodiment takes the arithmetic mean of all priorities on the path from a <segment> up to its top-level ancestor <section>, and selects each <segment> whose mean is equal to or greater than a threshold. In the example of FIG. 31, the arithmetic mean pa of the priority attribute values p4, p3, p2, and p1 of <segment> 1304, <section> 1303, <section> 1302, and <section> 1301 is calculated. pa is given by the following equation (4).

pa = (p1 + p2 + p3 + p4) / 4   ... (4)

This pa is compared with the threshold (S1, S2). If pa is equal to or greater than the threshold, <segment> 1304 is selected (S3), and the values of the start and end attributes of <segment> 1304 are output as the start time and end time of the selected scene (S4). This processing is performed for every <segment> (S1, S6). FIG. 32 shows a flowchart of the processing of the selection step 101 in the present embodiment.
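The path-averaging rule of equation (4) can be sketched as a tree walk. This is an illustrative sketch under stated assumptions: the `Node` class and the function name are hypothetical, and the tree is assumed to have already been parsed from the context content description data.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical in-memory form of a <section> or <segment> element."""
    tag: str                  # "section" or "segment"
    priority: float
    start: float = 0.0        # meaningful only for segments
    end: float = 0.0
    children: list = field(default_factory=list)


def select_by_path_mean(top_sections, threshold):
    """Return (start, end) of every segment whose path-mean priority >= threshold."""
    selected = []

    def walk(node, path):
        path = path + [node.priority]       # priorities from the top section down
        if node.tag == "segment":
            pa = sum(path) / len(path)      # equation (4), for any path length
            if pa >= threshold:
                selected.append((node.start, node.end))
        for child in node.children:
            walk(child, path)

    for section in top_sections:
        walk(section, [])
    return selected
```

With priorities p1 = 4, p2 = 3, p3 = 5 on the sections and p4 = 4 on a segment, pa = 4.0, so the segment is selected for any threshold up to 4.0.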

In the present embodiment, the arithmetic mean of the priorities from a <segment> up to its top-level ancestor <section> is calculated and used to select <segment> elements. Alternatively, the arithmetic mean of the priorities from a <section> that has <segment> children up to its top-level ancestor <section> may be taken, and the <section> elements that have <segment> children may be selected by thresholding. Similarly, the arithmetic mean from a <section> at another level of the hierarchy up to its top-level ancestor <section> may be taken, and the <section> elements at that level may be selected by thresholding.

[Third Embodiment]
A third embodiment according to the present invention is described below. This embodiment also differs from the first embodiment only in the processing of the selection step.

The processing of the selection step 101 in the present embodiment is described below with reference to the drawings. As in the first embodiment, the selection step 101 in the present embodiment considers only the <section> elements that have <segment> children and selects among them. In the present embodiment, a threshold is placed on the total duration of all selected scenes. That is, <section> elements are selected in descending order of priority until the total duration of the <section> elements selected so far reaches its largest value that does not exceed the threshold. FIG. 33 shows a flowchart of the selection step 101 in the present embodiment. Let Ω be the set of <section> elements that have <segment> children (S1). First, the <section> elements of Ω are sorted in descending order using the priority attribute as the key (S2). The <section> with the highest priority is selected from Ω (S4, S5) and removed from Ω. By examining all the child <segment> elements of the selected <section>, the start time and end time of the <section> are obtained and its duration is calculated (S6). The total duration of the <section> elements selected so far is then calculated (S7); if it exceeds the threshold, the processing ends (S8). If it is equal to or less than the threshold, the start time and end time of the <section> just selected are output (S9), and the processing returns to selecting the <section> with the highest priority from Ω. This is repeated until the total duration of the selected <section> elements exceeds the threshold or Ω becomes empty (S4, S8).
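The duration-limited selection can be sketched as a greedy loop over the sorted set Ω. This is a minimal sketch with assumed names and data shapes: each section is given as a `(priority, [(start, end), ...])` pair, and its duration is taken as the span from the earliest child start to the latest child end.

```python
def select_within_budget(sections, max_total):
    """Greedy selection of sections by priority under a total-duration limit.

    sections:  list of (priority, [(start, end), ...]) pairs, one per <section>
               that has <segment> children (hypothetical representation).
    max_total: threshold on the summed duration of the selected scenes.
    """
    # S2: sort the set (omega) in descending order of priority.
    omega = sorted(sections, key=lambda sec: sec[0], reverse=True)
    total = 0.0
    output = []
    for _priority, segments in omega:       # S4, S5: take the highest remaining
        start = min(s for s, _ in segments)  # S6: scene start from the children
        end = max(e for _, e in segments)    #     scene end from the children
        total += end - start                 # S7: running total of durations
        if total > max_total:                # S8: stop once the limit is exceeded
            break
        output.append((start, end))          # S9: output this scene
    return output
```

As in the flowchart, the loop stops at the first section that would push the total over the threshold; it does not go on to try lower-priority sections that might still fit.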

In the present embodiment, the processing focuses on <section> elements that have <segment> children, but the selection may instead be made among <segment> elements. In that case, the priority is the degree of importance among all <segment> elements in the content. The selection may also be made among <section> elements at the same level of the hierarchy that do not have <segment> children. That is, the processing may focus on <section> elements that lie at the same path length counted from <contents> or from <segment>.

Further, as in the second embodiment, the priority of each <section> and <segment> may be treated as an objective degree of importance in the context content, the arithmetic mean pa of the priorities from a <segment> up to its top-level ancestor <section> may be calculated, and either the <section> elements that have <segment> children or the <segment> elements may be selected in descending order of pa until the total duration reaches its largest value that does not exceed the threshold. The same effect is obtained in this case.

[Fourth Embodiment]
A fourth embodiment according to the present invention is described below. This embodiment also differs from the first embodiment only in the processing of the selection step.

The processing of the selection step 101 in the present embodiment is described below with reference to the drawings. As in the first embodiment, the selection step 101 in the present embodiment operates on <segment> elements and on the <section> elements that have <segment> children. As in the third embodiment, a threshold is placed on the total duration of all selected scenes. As in the first embodiment, the priority of a <section> that has <segment> children is its degree of importance among all <section> elements in the content that have <segment> children, that is, among the <section> elements enclosed by the dotted line in FIG. 34. The priority of a <segment> is its degree of importance among the <segment> elements that share the same parent <section>, that is, among the <segment> elements enclosed by each dash-dot line in FIG. 34.

FIG. 35 shows a flowchart of the processing of the selection step 101 in the present embodiment. First, let Ω be the set of <section> elements that have <segment> children (S1). Ω is sorted in descending order using priority as the key (S2). Next, the <section> with the highest priority is selected from Ω (S3, S4, S5). If several <section> elements share the highest priority, all of them are selected. Each selected <section> is added to a set Ω′ and removed from Ω. From the child <segment> elements of the selected <section>, the start time, end time, and duration of the scene represented by that <section> are obtained and stored (S6). If several <section> elements were selected, this is done for each of them. The total duration of the <section> elements in Ω′ is calculated (S7, S8) and compared with the threshold (S9). If the total duration equals the threshold, all the stored start times and end times are output and the processing ends (S10). If the total duration is less than the threshold, the processing returns to selecting a <section> from Ω (S4, S5). If Ω is empty at that point, all the stored start times and end times are output and the processing ends (S4). If the total duration is greater than the threshold, the following processing is performed. Among the elements of Ω′, the <section> with the lowest priority is selected (S11). If several <section> elements share the lowest priority, all of them are selected. Among the child <segment> elements of each selected <section>, the one with the lowest priority is deleted (S12), and the stored start time, end time, and duration of that <section> are updated (S13). Deleting a <segment> may split a scene in two; in that case, the start time, end time, and duration of each resulting part are stored. If deleting a <segment> leaves a <section> with no <segment> elements, that <section> is removed from Ω′. If several <section> elements were selected, this processing is performed for each of them. Deleting a <segment> shortens the duration of its <section> and therefore also the total duration. The deletion is repeated until the total duration of the elements of Ω′ is equal to or less than the threshold. When the total duration of the elements of Ω′ becomes equal to or less than the threshold (S14), all the stored start times and end times are output and the processing ends (S15).
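The select-then-trim procedure can be sketched as two loops: one that moves tied highest-priority sections from Ω into Ω′, and one that deletes low-priority segments until the total fits. This is a simplified sketch under several assumptions: sections are given as `(priority, [(segment_priority, start, end), ...])` pairs, a section's duration is the sum of its remaining segment durations, a tie between segments is broken by earliest start, and adjacent intervals are merged on output. None of these details are fixed by the text, and the function names are hypothetical.

```python
def total_duration(chosen):
    # Sum of the durations of all segments still kept in the chosen set.
    return sum(e - s for sec in chosen for (_p, s, e) in sec["segments"])


def merge(intervals):
    # Join intervals that touch or overlap so each output scene is contiguous.
    merged = []
    for s, e in sorted(intervals):
        if merged and s <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(e, merged[-1][1]))
        else:
            merged.append((s, e))
    return merged


def select_and_trim(sections, limit):
    # S1, S2: build omega and sort it in descending order of section priority.
    omega = sorted(
        ({"priority": p, "segments": list(segs)} for p, segs in sections),
        key=lambda sec: sec["priority"], reverse=True)
    chosen = []                                   # the set called omega-prime
    while omega:                                  # S4: stop when omega is empty
        top = omega[0]["priority"]
        while omega and omega[0]["priority"] == top:
            chosen.append(omega.pop(0))           # S5: move all tied sections
        if total_duration(chosen) >= limit:       # S7 to S9
            break
    while total_duration(chosen) > limit:         # trimming phase
        low = min(sec["priority"] for sec in chosen)          # S11
        for sec in [s for s in chosen if s["priority"] == low]:
            # S12: drop the lowest-priority segment (tuples sort by priority first).
            sec["segments"].remove(min(sec["segments"]))
        # A section with no segments left is removed from the chosen set.
        chosen = [s for s in chosen if s["segments"]]
    return merge((s, e) for sec in chosen for (_p, s, e) in sec["segments"])
```

Because every tied lowest-priority section loses a segment in the same pass, the result can end up somewhat below the threshold rather than exactly on it, which matches the flow described above.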

In the present embodiment, the processing focuses on <segment> elements and the <section> elements that have <segment> children, but the same effect is obtained when the processing instead focuses on a <section> and its child <section> elements.

In the <segment> deletion performed when the total duration exceeds the threshold, deletion starts from the <section> with the lowest priority. Alternatively, a threshold may be placed on the priority of <section> elements, and the <segment> with the lowest priority may be deleted from every <section> at or below that threshold. Furthermore, a threshold may be placed on the priority of <segment> elements, and every <segment> at or below that threshold may be deleted.

[Fifth Embodiment]
A fifth embodiment according to the present invention is described below. In the present embodiment, a moving image in an MPEG1 system stream is assumed as the media content. In this case, a media segment corresponds to one scene cut. In the present embodiment, the score is an objective degree of importance based on the context content of the scene in question.

FIG. 36 is a block diagram of the data processing method according to this embodiment of the present invention.
In FIG. 36, 1801 denotes a selection step, 1802 an extraction step, 1803 a configuration step, 1804 a delivery step, and 1805 a database. The selection step 1801 selects scenes of the media content from the context content description data and outputs the start time and end time of each scene together with data identifying the file in which it is stored. The extraction step 1802 receives the file-identifying data, start time, and end time output by the selection step 1801, refers to the physical content description data, and extracts from the media content file the data of the section delimited by the given start time and end time. The configuration step 1803 multiplexes the data output by the extraction step 1802 to form an MPEG1 system stream. The delivery step 1804 delivers the MPEG1 system stream created by the configuration step 1803 over a line. 1805 is a database that stores the media content together with its physical content description data and context content description data.

FIG. 37 shows the structure of the physical content description data in the present embodiment. In the present embodiment, the physical content is described as a tree structure. In the database 1805, one media content item is not necessarily stored as one file; a single media content item may be divided and stored as a plurality of files. Accordingly, the root of the tree structure of the physical content description data is written <contents> and represents one content item. The title of the content is attached to the root <contents> as an attribute. The children of <contents> are <mediaobject> elements, each of which represents a stored file. Each <mediaobject> carries as attributes a link locator to the stored file and an identifier id used to associate it with the context content description data. To handle media content composed of a plurality of files, an attribute seq indicating the order of the file within the content is also added.

FIG. 38 shows the structure of the context content description data in the present embodiment. It is the context content description data of the first embodiment with an association to the <mediaobject> elements of the physical content description data added. That is, the children of the root <contents> of the context content description data are <mediaobject> elements, and the children of each <mediaobject> are <section> elements. <section> and <segment> are the same as in the first embodiment. Each <mediaobject> of the context content description data is matched with a <mediaobject> of the physical content description data. That is, a scene of the media content described by a descendant of a <mediaobject> in the context content description data is stored in the file indicated by the <mediaobject> of the physical content description data that has the same id attribute value. The time information start and end of a <segment> is set as the time from the head of each file. That is, when one media content item consists of a plurality of files, the head of each file is at time 0, and the start time of each scene is expressed as the elapsed time from the head of the file in which it is stored.
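The id-based association between the two descriptions can be illustrated with a small XML example. The documents below are hypothetical: the element and attribute names (`contents`, `mediaobject`, `section`, `segment`, `title`, `id`, `seq`, `locator`, `priority`, `start`, `end`) come from the text, but the file names, the values, and the use of plain seconds for start and end are assumptions, and the DTDs in the figures are not reproduced here. The sketch uses Python's standard `xml.etree.ElementTree` module.

```python
import xml.etree.ElementTree as ET

# Hypothetical physical content description: one content item stored as two files.
PHYSICAL = """<contents title="Sample">
  <mediaobject id="1" seq="1" locator="part1.mpg"/>
  <mediaobject id="2" seq="2" locator="part2.mpg"/>
</contents>"""

# Hypothetical context content description; times count from the head of each file.
CONTEXT = """<contents title="Sample">
  <mediaobject id="1">
    <section priority="3">
      <segment start="0" end="12" priority="2"/>
    </section>
  </mediaobject>
  <mediaobject id="2">
    <section priority="5">
      <segment start="4" end="9" priority="4"/>
    </section>
  </mediaobject>
</contents>"""


def scenes_with_files(physical_xml, context_xml):
    """Resolve every <segment> to (locator, start, end) through the shared id."""
    locators = {
        media.get("id"): media.get("locator")
        for media in ET.fromstring(physical_xml).iter("mediaobject")
    }
    scenes = []
    for media in ET.fromstring(context_xml).iter("mediaobject"):
        file_name = locators[media.get("id")]   # same id in both descriptions
        for seg in media.iter("segment"):
            scenes.append((file_name, float(seg.get("start")), float(seg.get("end"))))
    return scenes
```

Because each file starts at time 0, the second segment above refers to seconds 4 to 9 of `part2.mpg`, not of the content as a whole.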

As one way of representing the physical content description data and the context content description data of the present embodiment on a computer, a description in Extensible Markup Language (XML) can be used. FIG. 39 shows a Document Type Definition (DTD) for describing the physical content description data of FIG. 37 in XML, together with an example of physical content description data conforming to this DTD. FIGS. 40 to 45 show a DTD for describing the context content description data of FIG. 38 in XML, together with an example of context content description data conforming to this DTD.

The processing of the selection step 1801 is described below. The selection step 1801 selects scenes using any of the methods described in the first to fourth embodiments. As its result, however, it outputs the id of the corresponding <mediaobject> of the physical content description data together with the start time and end time. FIG. 46 shows an example of the output of the selection step 1801 when the physical content description data is written as an XML document conforming to the DTD of FIG. 39 and the context content description data is written as an XML document conforming to the DTD of FIGS. 40 to 45. In FIG. 46, the id of the <mediaobject> of the physical content description data follows "id=", the start time follows "start=", and the end time follows "end=".

The processing of the extraction step 1802 is described below. FIG. 47 shows a block diagram of the extraction step 1802 according to the present embodiment. In FIG. 47, the extraction step 1802 in the present embodiment comprises an interface means 2401, a separating means 2402, a video skimming means 2403, and an audio skimming means 2404. The interface means 2401 takes the physical content description data and the output of the selection step 1801 as input, retrieves the media content file from the database 1805, outputs its data to the separating means 2402, and outputs the start time and end time of each section output by the selection step 1801 to the video skimming means 2403 and the audio skimming means 2404. Because the media content in the present embodiment is an MPEG1 system stream in which a video stream and an audio stream are multiplexed, the separating means 2402 separates it into the video stream and the audio stream. The video skimming means 2403 takes the separated video stream and the sections output by the interface means 2401 as input, and outputs only the data of the selected sections from the input video stream. The audio skimming means 2404 takes the separated audio stream and the sections output by the interface means 2401 as input, and outputs only the data of the selected sections from the input audio stream.

The processing of the interface means 2401 is described below. FIG. 48 shows a flowchart of this processing. The interface means 2401 first receives the physical content description data of the media content and the output of the selection step 1801, such as that shown in FIG. 46. Because the time order of the files is obtained from the id attribute of <mediaobject> in the physical content description data, the output of the selection step 1801 is sorted into time order using id as the key (S1). It is then converted into data such as that shown in FIG. 49, in which the entries for the same file are grouped together and arranged in order of start time. Next, the interface means 2401 processes the data of FIG. 49 from the top as follows. First, it uses the id to look up the <mediaobject> in the physical content description data and obtains the file name from its locator attribute. It reads the data of the file with that name from the database and outputs it to the separating means 2402 (S2, S3). It then outputs all the start times and end times of the selected sections in that file, which follow the id in FIG. 49, to the video skimming means 2403 and the audio skimming means 2404 (S4). When all the data has been processed in this way, the processing ends (S5). If data remains, the interface means waits for the separating means 2402, the video skimming means 2403, and the audio skimming means 2404 to finish (S6, S7) and then repeats the above processing.
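The regrouping step of the interface means (sort by file, then by start time, and collect the sections of each file) can be sketched as follows. The record shape `(id, start, end)`, the `order` mapping from id to file position, and the function name are assumptions for illustration; the exact layout of FIG. 46 and FIG. 49 is not reproduced.

```python
from itertools import groupby


def group_selection(selected, order):
    """Group selected sections per file, files in time order, sections by start.

    selected: list of (media_id, start, end) records from the selection step.
    order:    dict mapping media_id to the file's position in time order
              (hypothetical; derived from the physical content description).
    """
    # Sort by file order first, then by start time within each file (S1).
    ordered = sorted(selected, key=lambda rec: (order[rec[0]], rec[1]))
    # Collect all sections of the same file under one entry.
    return [
        (media_id, [(start, end) for _id, start, end in records])
        for media_id, records in groupby(ordered, key=lambda rec: rec[0])
    ]
```

Each resulting entry corresponds to one pass of the loop described above: read the file once, then hand all of its sections to the skimming means.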

The processing of the separating means 2402 is described below. FIG. 50 shows a flowchart of this processing. The separating means 2402 receives the MPEG1 system stream that constitutes the media content from the interface means 2401, separates it into a video stream and an audio stream, outputs the video stream to the video skimming means 2403 and the audio stream to the audio skimming means 2404 (S1 to S10), and, after the output is complete (S9, S11), notifies the interface means 2401 that its processing has finished (S12). As the flowchart of FIG. 50 shows, apart from this end-of-processing notification, the processing is the same as that of the separating means described in the first embodiment.

The processing of the video skimming means 2403 is described below. FIG. 51 shows a flowchart of this processing. As the flowchart of FIG. 51 shows, apart from notifying the interface means 2401 when its processing has finished (S16, S17), the processing is the same as that of the video skimming means described in the first embodiment.

The processing of the audio skimming means 2404 is described below. FIG. 52 shows a flowchart of this processing. As the flowchart of FIG. 52 shows, apart from notifying the interface means 2401 when its processing has finished (S11, S12), the processing is the same as that of the audio skimming means described in the first embodiment.

The configuration step 1803 time-division multiplexes the video stream and the audio stream output by the extraction step 1802, using the multiplexing scheme of the MPEG1 system standardized in the international standard ISO/IEC IS 11172-1. When the media content is divided and stored as a plurality of files, the extraction step 1802 outputs a video stream and an audio stream for each file, so multiplexing is performed for each of them.

The delivery step 1804 delivers the MPEG1 system stream multiplexed by the configuration step 1803 over a line. When the configuration step 1803 outputs a plurality of MPEG1 system streams, all of them are delivered in the order in which they were output.

In the present embodiment, when the media content is divided and stored as a plurality of files, the extraction step 1802 processes each file separately. The same effect is obtained if the corresponding video streams and audio streams of the media content files are all concatenated and output, and the configuration step 1803 multiplexes the resulting video stream and audio stream into a single MPEG1 system stream. In this case, the time code changing process in the video skimming means 2403 must be performed as follows. That is, one counter C of the number of output frames is prepared for each video stream, and C is initialized only for the first file (S18 and S3 in FIG. 51). FIG. 53 shows the flowchart of the video skimming means 2403 for this case. In addition, although the context content description data and the physical content description data are described separately in the present embodiment, they may be combined into one by adding the seq and locator attributes of the physical content description data as attributes of <mediaobject> in the context content description data.

[Sixth Embodiment]
A sixth embodiment according to the present invention is described below. In the present embodiment, a moving image in an MPEG1 system stream is assumed as the media content. In this case, a media segment corresponds to one scene cut. In the present embodiment, the score is an objective degree of importance based on the context content of the scene in question.

FIG. 54 is a block diagram of a data processing method according to this embodiment of the present invention.
In FIG. 54, 3101 denotes a selection step, 3102 an extraction step, 3103 a configuration step, 3104 a delivery step, and 3105 a database. Selection step 3101 selects a scene of the media content from the context content description data and outputs the start time and end time of the scene together with data identifying the file in which the scene is stored; it is the same as the selection step described in the fifth embodiment. Extraction step 3102 receives the data identifying the file, the start time, and the end time output by selection step 3101 and, referring to the physical content description data, extracts from the media content file the data of the section delimited by the input start time and end time; it is the same as the extraction step described in the fifth embodiment. Configuration step 3103 multiplexes some or all of the streams output by extraction step 3102, according to the line condition determined by delivery step 3104, to form an MPEG1 system stream. Delivery step 3104 determines the condition of the line used for delivery and reports the result to configuration step 3103, and delivers the MPEG1 system stream created by configuration step 3103 over the line. Database 3105 stores the media content together with its physical content description data and context content description data.

FIG. 55 shows a block diagram of configuration step 3103 and delivery step 3104 according to the present embodiment. In FIG. 55, configuration step 3103 comprises stream selection means 3201 and multiplexing means 3202, and delivery step 3104 comprises line condition determination means 3203 and delivery means 3204. Stream selection means 3201 receives the video streams and audio streams output by extraction step 3102 and the line condition output by line condition determination means 3203. When the line is in a condition sufficient to send all the data, it outputs all the streams to multiplexing means 3202. When sending all the data would take a long time, for example because the line is congested or has a small capacity, it selects only some of the multiple video streams and audio streams and outputs them to multiplexing means 3202. Various combinations are possible for this selection: for example, only the base-layer stream for video, and only the monaural stream, only the left stereo channel, or only the right stereo channel for audio. However, when there is only one video stream and only one audio stream, that stream is output regardless of the line condition. Multiplexing means 3202 time-division multiplexes the video stream and the audio stream output by stream selection means 3201 according to the MPEG1 system multiplexing method standardized in international standard ISO/IEC 11172-1. Line condition determination means 3203 examines the capacity of the line used for delivery, its current usage, and the like, and outputs the result to stream selection means 3201. Delivery means 3204 delivers the MPEG1 system stream multiplexed by multiplexing means 3202 over the line.
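The behavior of stream selection means 3201 can be illustrated with a minimal Python sketch. The `Stream` class, the label strings, and the boolean `line_is_sufficient` are assumptions made for illustration; the reduced selection shown here (base-layer video plus the first audio stream) is only one of the combinations the description allows.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Stream:
    kind: str   # "video" or "audio" (hypothetical labels)
    label: str  # e.g. "base", "enhancement", "mono", "left", "right"


def select_streams(streams: List[Stream], line_is_sufficient: bool) -> List[Stream]:
    """Return the streams to hand to the multiplexer for the given line condition."""
    videos = [s for s in streams if s.kind == "video"]
    audios = [s for s in streams if s.kind == "audio"]
    if line_is_sufficient:
        return videos + audios  # send everything
    chosen: List[Stream] = []
    # A single stream of a kind is always output, whatever the line condition.
    if len(videos) <= 1:
        chosen += videos
    else:
        # Assumed policy: keep the base layer, falling back to the first stream.
        chosen += [s for s in videos if s.label == "base"] or videos[:1]
    # Assumed policy: keep only the first audio stream (e.g. mono or left).
    chosen += audios[:1]
    return chosen
```

A real line condition determination would report capacity and usage rather than a single flag; the flag keeps the sketch focused on the selection rule.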

In the present embodiment, stream selection means 3201 outputs the video stream regardless of the line condition when there is only one video stream. Alternatively, when sending all the data over the line would take a long time, only representative images of the video stream may be selected and sent. Representative images may be selected, for example, by describing their time codes in the context content description data, or by selecting only the independently decodable frames, called I-pictures, from among the frames.

[Seventh Embodiment]
A seventh embodiment according to the present invention is described below. In the present embodiment, a moving image of an MPEG1 system stream is assumed as the media content. In this case, a media segment corresponds to one scene cut. In the present embodiment, the score is a degree of importance of the corresponding scene from the viewpoint of a keyword, such as a character or an event, selected by the user or the like.

FIG. 56 is a block diagram of a data processing method according to the present embodiment. In FIG. 56, 3301 denotes a selection step and 3302 an extraction step. Selection step 3301 selects a scene of the media content from the keywords of the context content description data and their scores, and outputs the start time and end time of the scene. Extraction step 3302 extracts the data of the section of the media content delimited by the start time and end time output by selection step 3301.

FIG. 57 shows the structure of the context content description data of the present embodiment. In the present embodiment, the context content is described as a tree structure. Siblings in the tree are assumed to be arranged in chronological order from the left. In FIG. 57, the root of the tree, denoted <contents>, represents one piece of content and carries the title of that content as an attribute.

The child elements of <contents> are <section> elements. Each <section> carries, as attributes, pairs (keyword, priority) consisting of keyword, a keyword representing the content of the scene, the characters appearing in it, and so on, and priority, which represents the importance of that keyword. priority is an integer from 1 to 5, where 1 is the lowest importance and 5 is the highest. The (keyword, priority) pairs are set so that they can be used as keys when searching for scenes, persons, and the like that the user wants to see. For this reason, a plurality of (keyword, priority) pairs may be attached to a single <section>. For example, when describing characters, as many (keyword, priority) pairs are attached as there are persons appearing in the scene, and priority is set so that its value is high when the person named by the keyword appears frequently in the scene.

A <segment> represents one scene cut. It carries, as attributes, (keyword, priority) pairs like those of <section> and, as time information for the scene, start, which represents the start time, and end, which represents the end time. Scene cuts may be detected with commercially available software or software distributed over a network, or may be made by hand. Although the time information in the present embodiment consists of the start time and end time of the scene cut, the same effect is obtained if it consists of the start time and the duration of the scene. In this case, the end time of the scene is obtained by adding the duration to the start time.
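The two equivalent forms of time information can be sketched in Python as follows. The `Segment` class, its field names, and the use of seconds as the time unit are assumptions for illustration; the actual attribute syntax is defined by the DTD in the figures, which is not reproduced here.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Segment:
    start: float                      # start time (seconds assumed)
    end: float                        # end time (seconds assumed)
    priorities: Dict[str, int] = field(default_factory=dict)  # keyword -> priority (1 to 5)

    @classmethod
    def from_duration(cls, start: float, duration: float,
                      priorities: Optional[Dict[str, int]] = None) -> "Segment":
        """Build a segment from (start, duration); end = start + duration."""
        return cls(start, start + duration, dict(priorities or {}))

    @property
    def duration(self) -> float:
        return self.end - self.start


scene = Segment.from_duration(10.0, 5.0, {"hero": 3})
print(scene.end)  # 15.0
```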

As one way of representing context content description data of this structure on a computer, a description in Extensible Markup Language (XML) can be used. XML is a data description language whose standardization is being advanced by the World Wide Web Consortium; version 1.0 became a Recommendation on February 10, 1998. The XML 1.0 specification is available at http://www.w3.org/TR/1998/REC-xml-19980210. FIGS. 58 to 66 show a Document Type Definition (DTD) for describing the context content description data of the present embodiment in XML, together with an example of context content description data based on this DTD. FIGS. 67 to 80 show an example of context content description data in which representative data (dominant-data) of media segments, such as representative images (video information) and keywords (sound information), is added to the context content description data shown in FIGS. 58 to 66, together with a DTD for describing this context content description data in XML.

The processing in selection step 3301 is described below. In the present embodiment, selection step 3301 operates on <segment> elements and on the <section> elements that have <segment> elements as children. FIG. 81 shows a flowchart of the processing of selection step 3301 in the present embodiment. Selection step 3301 takes as input a keyword serving as the key for scene selection and a threshold for its priority. From the <section> elements of the context content description data that have <segment> children, it selects those that have the same keyword as the key with a priority equal to or greater than the threshold (S2, S3). Then, from the <segment> elements of each selected <section>, it selects only those that have the same keyword as the key with a priority equal to or greater than the threshold (S5, S6). From the start and end attributes of the <segment> elements selected in this way, it obtains the start time and end time of each selected scene and outputs them (S7, S8, S9, S10, S11, S1, S4).
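The two-stage selection described above can be sketched in Python as follows. The dictionary layout (`priorities`, `segments`, `start`, `end`) and the sample keyword are assumptions for illustration, not the format defined by the DTD; the sketch does not follow the step numbering of FIG. 81.

```python
def has_priority(priorities, keyword, threshold):
    """True if `keyword` is attached with a priority of at least `threshold`."""
    return keyword in priorities and priorities[keyword] >= threshold


def select_scenes(sections, keyword, threshold):
    """Select <section>s first, then filter their <segment> children."""
    scenes = []
    for section in sections:
        if not has_priority(section["priorities"], keyword, threshold):
            continue  # the whole section is skipped, whatever its segments say
        for segment in section["segments"]:
            if has_priority(segment["priorities"], keyword, threshold):
                scenes.append((segment["start"], segment["end"]))
    return scenes


sections = [
    {"priorities": {"hero": 4}, "segments": [
        {"priorities": {"hero": 5}, "start": 0, "end": 10},
        {"priorities": {"hero": 2}, "start": 10, "end": 25},
    ]},
    {"priorities": {"hero": 2}, "segments": [
        {"priorities": {"hero": 5}, "start": 25, "end": 40},
    ]},
]
print(select_scenes(sections, "hero", 3))  # [(0, 10)]
```

Note that the segment starting at 25 is not selected even though its own priority is 5, because its parent <section> falls below the threshold.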

Although the processing in the present embodiment operates on <segment> elements and the <section> elements that have them as children, the same processing may be performed on the parent-child relationship between a <section> at a given level and its child <section> elements. The parent-child relationship is not limited to two levels; the number of levels may be increased and the same processing performed down to the <segment> elements that are the leaves of the tree. Further, the search key may be a set of a plurality of keywords and the conditions between them. Conditions between keywords include "either", "both", and combinations of "either" and "both". When there are a plurality of keywords, the selection threshold may also be specified for each keyword. The keyword serving as the search key may be received as user input, or may be set automatically by the system from a user profile or the like.

The operation of extraction step 3302 is the same as that of the extraction step described in the first embodiment.

As an effect of the present embodiment, as shown in FIG. 82, the video stream and the audio stream output by extraction step 3302 are input to video playback means and audio playback means, respectively, and played back in synchronization, so that only the scenes of the media content that the individual viewer wants to see can be played back. In addition, by multiplexing the video stream and audio stream thus obtained, an MPEG1 system stream can be created that collects the scenes of the media content that the individual viewer wants to see.

[Eighth Embodiment]
An eighth embodiment according to the present invention is described below. The present embodiment differs from the seventh embodiment only in the processing of the selection step.

The processing of selection step 3301 in the present embodiment is described below with reference to the drawings. In the present embodiment, selection step 3301 operates only on <segment> elements. FIG. 83 shows a flowchart of selection step 3301 in the present embodiment. As shown in FIG. 83, selection step 3301 takes as input a keyword serving as the search key and a threshold for its priority, and selects, from the <segment> elements of the context content description data, those that have the same keyword as the key with a priority equal to or greater than the threshold (S1 to S6).
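A minimal Python sketch of this segment-only selection follows. The nested-dictionary tree (`tag`, `children`, `priorities`, `start`, `end`) is an assumed in-memory representation, not the format of the DTD; priorities attached to <section> elements are deliberately ignored, which is the difference from the seventh embodiment.

```python
def iter_segments(node):
    """Yield every <segment> leaf below `node`, in document (time) order."""
    if node["tag"] == "segment":
        yield node
        return
    for child in node.get("children", []):
        yield from iter_segments(child)


def select_segments(root, keyword, threshold):
    """Select <segment>s by their own (keyword, priority) pairs only."""
    return [
        (seg["start"], seg["end"])
        for seg in iter_segments(root)
        if keyword in seg["priorities"] and seg["priorities"][keyword] >= threshold
    ]


root = {"tag": "contents", "children": [
    {"tag": "section", "priorities": {"hero": 1}, "children": [
        {"tag": "segment", "priorities": {"hero": 5}, "start": 0, "end": 10},
        {"tag": "segment", "priorities": {"rival": 4}, "start": 10, "end": 20},
    ]},
    {"tag": "section", "priorities": {}, "children": [
        {"tag": "segment", "priorities": {"hero": 3}, "start": 20, "end": 30},
    ]},
]}
print(select_segments(root, "hero", 3))  # [(0, 10), (20, 30)]
```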

Although the processing in the present embodiment operates only on <segment> elements, it may instead operate on the <section> elements at a given level. The search key may also be a set of a plurality of keywords and the conditions between them. Conditions between keywords include "either", "both", and combinations of "either" and "both". When there are a plurality of keywords, the selection threshold may also be specified for each keyword.

[Ninth Embodiment]
A ninth embodiment according to the present invention is described below. The present embodiment also differs from the seventh embodiment only in the processing of the selection step.

The processing of selection step 3301 in the present embodiment is described below with reference to the drawings. As in the seventh embodiment, selection step 3301 operates only on <segment> elements and on the <section> elements that have <segment> elements as children. In the present embodiment, a threshold is placed on the sum of the durations of all the selected scenes; that is, scenes are selected so that the sum of the durations of the scenes selected so far is as large as possible without exceeding this threshold. FIG. 84 shows a flowchart of the selection step in the present embodiment.
First, selection step 3301 receives one keyword serving as the search key. Next, from the <section> elements that have <segment> children, it extracts all those having the keyword of the search key. Let this set be Ω (S1, S2). The set Ω' of selected <segment> elements is initially empty (S2). The elements of Ω are sorted in descending order of the priority of the search keyword (S3). Then the <section> with the highest priority for the search keyword is taken from the sorted Ω (S5) and deleted from Ω (S6). If several <section> elements share the highest priority, all of them are taken. Of the child <segment> elements of the <section> elements taken, only those having the search key are selected and added to Ω' (S7). The sum of the durations of the scenes in Ω' is calculated (S8) and compared with the threshold (S9). If the sum equals the threshold, all the intervals of the <segment> elements in Ω' are output and processing ends (S14). If the sum is smaller than the threshold, processing returns to the selection of the highest-priority <section> from Ω (S5) and the above is repeated; however, if Ω is empty, all the intervals of the <segment> elements in Ω' are output and processing ends (S4).
If the sum of the durations of the scenes in Ω' exceeds the threshold, the following is performed. Of the <segment> elements in Ω', the one with the lowest priority for the search keyword is deleted (S11). If several <segment> elements share the lowest priority, all of them are deleted. The sum of the durations of Ω' is recalculated (S12) and compared with the threshold (S13). If the sum is still larger than the threshold, processing returns to the deletion of a <segment> from Ω' (S11) and this is repeated; however, if Ω' is empty, processing ends (S10). If the sum is equal to or less than the threshold, all the intervals of the <segment> elements in Ω' are output and processing ends (S14).
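The add-then-trim procedure above can be sketched in Python as follows. The dictionary layout and the names `omega` and `chosen` (standing for Ω and Ω') are assumptions for illustration, and the sketch does not reproduce the step numbering of FIG. 84. Returning the intervals sorted by start time is an added convenience, not something the description specifies.

```python
def total_duration(segments):
    return sum(seg["end"] - seg["start"] for seg in segments)


def select_within_budget(sections, keyword, max_total):
    # Omega: sections carrying the keyword, highest priority first.
    omega = sorted(
        (s for s in sections if keyword in s["priorities"]),
        key=lambda s: s["priorities"][keyword],
        reverse=True,
    )
    chosen = []  # Omega': selected segments, initially empty
    # Add phase: take every section tied for the highest remaining priority.
    while omega and total_duration(chosen) < max_total:
        top = omega[0]["priorities"][keyword]
        batch = [s for s in omega if s["priorities"][keyword] == top]
        omega = omega[len(batch):]  # ties are contiguous after sorting
        for section in batch:
            chosen.extend(
                seg for seg in section["segments"] if keyword in seg["priorities"]
            )
    # Trim phase: drop all segments tied for the lowest priority until it fits.
    while chosen and total_duration(chosen) > max_total:
        lowest = min(seg["priorities"][keyword] for seg in chosen)
        chosen = [seg for seg in chosen if seg["priorities"][keyword] != lowest]
    return sorted((seg["start"], seg["end"]) for seg in chosen)
```

Because ties are added and removed as a group, the result can fall well short of the budget, or even be empty, when a single tied group is longer than the threshold.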

Although the processing in the present embodiment operates on <segment> elements and the <section> elements that have them as children, it may instead operate on the parent-child relationship between a <section> at a given level and its child <section> elements. The parent-child relationship is not limited to two levels, and the processing may be performed over more levels. For example, when processing the levels from the topmost <section> down to <segment>, the topmost <section> elements are selected first, then child <section> elements are selected from each selected <section>, then their children are selected in turn, and so on until <segment> elements are selected, thereby generating the set Ω' of selected <segment> elements.

In the present embodiment, elements are selected in descending order of the priority of the search keyword. Alternatively, a threshold may be set for priority, and elements whose priority is equal to or greater than the threshold may be selected in descending order of priority. This threshold may be set separately for <section> and <segment>.

Further, although the search key in the present embodiment is a single keyword, it may be a set of a plurality of keywords and the conditions between them. Conditions between keywords include "either", "both", and combinations of "either" and "both". In this case, a rule is also needed for determining the priority used when selecting or deleting <section> and <segment> elements. One example of such a rule is as follows. When the condition is "either", the largest of the priorities of the keywords concerned is used as the priority. When the condition is "both", the smallest of the priorities of the keywords concerned is used as the priority. For a combination of "either" and "both", the priority can also be obtained by applying this rule. Even when the search key has a plurality of keywords, a threshold may be set for the priority, and the processing may be performed only on elements whose priority is equal to or greater than the threshold.
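The example rule for combining priorities can be sketched in Python as a small recursive evaluator. The nested-tuple representation of a condition and the convention that a keyword not attached to an element counts as priority 0 are assumptions for illustration; the description itself only gives the max/min rule.

```python
def combined_priority(condition, priorities):
    """Evaluate a keyword condition against one element's (keyword, priority) pairs.

    `condition` is either a keyword string or a tuple (operator, operands),
    where operator is "either" or "both" and operands is a list of conditions.
    """
    if isinstance(condition, str):
        # Assumption: an absent keyword counts as 0, below the valid range 1 to 5.
        return priorities.get(condition, 0)
    operator, operands = condition
    values = [combined_priority(operand, priorities) for operand in operands]
    if operator == "either":
        return max(values)  # largest priority among the keywords
    if operator == "both":
        return min(values)  # smallest priority among the keywords
    raise ValueError(f"unknown operator: {operator!r}")


element = {"hero": 4, "rival": 2}
print(combined_priority(("either", ["hero", "rival"]), element))  # 4
print(combined_priority(("both", ["hero", "rival"]), element))    # 2
```

With the priority-0 convention, a "both" condition evaluates to 0 whenever one keyword is missing, so such an element never passes a threshold of 1 or more.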

[Tenth Embodiment]
A tenth embodiment according to the present invention is described below. The present embodiment differs from the seventh embodiment only in the processing of the selection step.

The processing of selection step 3301 in the present embodiment is described below with reference to the drawings. As in the eighth embodiment, selection step 3301 operates only on <segment> elements. As in the ninth embodiment, a threshold is placed on the sum of the durations of all the selected scenes; that is, scenes are selected so that the sum of the durations of the scenes selected so far is as large as possible without exceeding this threshold. FIG. 85 shows a flowchart of the selection step in the present embodiment.
First, selection step 3301 receives one keyword serving as the search key. As initialization, the set Ω' is made empty (S2). Next, all the <segment> elements having the keyword of the search key are extracted (S1); let this set be Ω. The elements of Ω are sorted in descending order of the priority of the search keyword (S3). Then the <segment> with the highest priority for the search keyword is taken from the sorted Ω (S5) and deleted from Ω. If several <segment> elements share the highest priority, all of them are taken. If Ω is empty, all the intervals of the <segment> elements in Ω' are output and processing ends (S4). The sum T1 of the durations of the <segment> elements taken (S6) and the sum T2 of the durations of the scenes in Ω' (S7) are calculated, and T1+T2 is compared with the threshold (S8). If T1+T2 exceeds the threshold, all the intervals of the <segment> elements in Ω' are output and processing ends (S11). If T1+T2 equals the threshold, all the <segment> elements taken are added to Ω' (S9, S10), and then all the intervals of the <segment> elements in Ω' are output and processing ends (S11). If T1+T2 is smaller than the threshold, all the <segment> elements taken are added to Ω', and processing returns to the selection of a <segment> from Ω (S10).
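The greedy procedure above can be sketched in Python as follows. The dictionary layout and the names `omega` and `chosen` (for Ω and Ω') are assumptions for illustration; the variables `t1` and `t2` mirror T1 and T2 in the description. Sorting the output by start time is an added convenience.

```python
def total_duration(segments):
    return sum(seg["end"] - seg["start"] for seg in segments)


def select_segments_within_budget(segments, keyword, max_total):
    # Omega: segments carrying the keyword, highest priority first.
    omega = sorted(
        (seg for seg in segments if keyword in seg["priorities"]),
        key=lambda seg: seg["priorities"][keyword],
        reverse=True,
    )
    chosen = []  # Omega', initially empty
    while omega:
        top = omega[0]["priorities"][keyword]
        batch = [seg for seg in omega if seg["priorities"][keyword] == top]
        omega = omega[len(batch):]  # ties are contiguous after sorting
        t1 = total_duration(batch)   # durations of the segments just taken
        t2 = total_duration(chosen)  # durations already in Omega'
        if t1 + t2 > max_total:
            break                    # adding the batch would overshoot: stop here
        chosen.extend(batch)
        if t1 + t2 == max_total:
            break                    # budget exactly used
    return sorted((seg["start"], seg["end"]) for seg in chosen)
```

Unlike the ninth embodiment, nothing is ever removed from Ω': the procedure stops at the first tied group that does not fit, even if a later, shorter group would.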

Although the processing in the present embodiment operates only on <segment> elements, it may instead operate on the <section> elements at a given level. In the present embodiment, elements are selected in descending order of the priority of the search keyword; alternatively, a threshold may be set for priority, and elements whose priority is equal to or greater than the threshold may be selected in descending order of priority.
Further, although the search key in the present embodiment is a single keyword, it may be a set of a plurality of keywords and the conditions between them. Conditions between keywords include "either", "both", and combinations of "either" and "both". In this case, a rule is also needed for determining the priority used when selecting or deleting <section> and <segment> elements. One example of such a rule is as follows. When the condition is "either", the largest of the priorities of the keywords concerned is used as the priority. When the condition is "both", the smallest of the priorities of the keywords concerned is used as the priority. For a combination of "either" and "both", the priority can also be obtained by applying this rule. Even when the search key has a plurality of keywords, a threshold may be set for the priority, and the processing may be performed only on elements whose priority is equal to or greater than the threshold.

[Eleventh Embodiment]
An eleventh embodiment according to the present invention is described below. The present embodiment differs from the seventh to tenth embodiments in how the context content description data describes the viewpoints that serve as keywords for scene selection and their degrees of importance. In the seventh to tenth embodiments, as shown in FIG. 57, a viewpoint and the degree of importance from that viewpoint are described by attaching pairs of a keyword and an importance (keyword, priority) to <section> and <segment> as attributes. In the present embodiment, as shown in FIG. 133, viewpoints and degrees of importance are described by adding the attribute povlist to <contents> and the attribute povvalue to <section> and <segment>.

The attribute povlist represents the viewpoints in vector form, as shown in FIG. 134, and the attribute povvalue represents the degrees of importance in vector form, as shown in FIG. 135. The viewpoints and the degrees of importance correspond one to one and are arranged in the same order to form the attributes povlist and povvalue. For example, in FIG. 134 and FIG. 135, the importance for viewpoint 1 is 5, the importance for viewpoint 2 is 0, the importance for viewpoint 3 is 2, and the importance for viewpoint n (where n is a positive integer) is 0. An importance of 0 for viewpoint 2 corresponds, in terms of the seventh embodiment, to the case where no (keyword, priority) attribute whose keyword is viewpoint 2 is attached.
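The correspondence between the vector form and the (keyword, priority) pairs of the seventh embodiment can be sketched in Python as follows. Python lists and a dictionary stand in for the attribute values; how povlist and povvalue are actually serialized in XML is defined in the figures and is not assumed here. The viewpoint names are placeholders.

```python
def pov_to_pairs(povlist, povvalue):
    """Convert aligned povlist/povvalue vectors to {keyword: priority}.

    An importance of 0 means the viewpoint is not attached, so it is omitted.
    """
    if len(povlist) != len(povvalue):
        raise ValueError("povlist and povvalue must have the same length")
    return {pov: value for pov, value in zip(povlist, povvalue) if value != 0}


def pairs_to_pov(povlist, pairs):
    """Convert {keyword: priority} to a povvalue vector aligned with povlist."""
    return [pairs.get(pov, 0) for pov in povlist]


povlist = ["viewpoint1", "viewpoint2", "viewpoint3"]
povvalue = [5, 0, 2]
print(pov_to_pairs(povlist, povvalue))  # {'viewpoint1': 5, 'viewpoint3': 2}
```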

FIGS. 136 to 163 show a Document Type Definition (DTD) for describing the context content description data of the present embodiment in Extensible Markup Language (XML), which is used to represent the data on a computer, and FIGS. 164 to 196 show an example of context content description data based on this DTD. In the present embodiment as well, the same processing as described in the seventh to tenth embodiments is performed using this context content description data.

In the present embodiment, the attribute povlist is added to <contents>, and the attribute povvalue is added to <section> and <segment>. However, as shown in FIG. 197, the attribute povlist may also be added to <section> and <segment>. In a <section> or <segment> to which the attribute povlist is added, the attribute povvalue corresponds to the attribute povlist added to that <section> or <segment>. In a <section> or <segment> to which no attribute povlist is added, the attribute povvalue may correspond either to the attribute povlist added to <contents>, or to the attribute povlist of the nearest <section>, among the ancestors of that <section> or <segment>, to which an attribute povlist is added.
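The nearest-ancestor variant of this rule can be sketched as follows. This is only an illustration under stated assumptions: the sample document, the viewpoint names, the whitespace-separated attribute format, and the helper names effective_povlist and scores_for are hypothetical and are not taken from the DTD in the figures.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; attribute values are whitespace-separated by assumption.
SAMPLE = """
<contents povlist="goal foul">
  <section povvalue="3 1">
    <section povlist="goal replay" povvalue="4 2">
      <segment povvalue="5 0"/>
    </section>
    <segment povvalue="0 2"/>
  </section>
</contents>
"""


def effective_povlist(elem, parents):
    """Return the povlist of elem itself or of its nearest ancestor that has one."""
    node = elem
    while node is not None:
        povlist = node.get("povlist")
        if povlist is not None:
            return povlist.split()
        node = parents.get(node)  # the root has no parent, so this ends the loop
    raise ValueError("no povlist found on the element or its ancestors")


def scores_for(elem, parents):
    """Interpret elem's povvalue against the povlist that applies to it."""
    values = [int(v) for v in elem.get("povvalue", "").split()]
    return dict(zip(effective_povlist(elem, parents), values))


root = ET.fromstring(SAMPLE)
# ElementTree keeps no parent pointers, so build a child-to-parent map once.
parents = {child: parent for parent in root.iter() for child in parent}
inner, outer = list(root.iter("segment"))
```

Here the inner <segment> is read against the povlist of its enclosing <section>, while the outer <segment> falls back to the povlist of <contents>. The other permitted reading, in which every element without its own povlist refers directly to <contents>, would simply skip the ancestor walk.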

FIGS. 198 to 222 and FIGS. 223 to 252 respectively show a DTD, corresponding to FIG. 197, for describing the context content description data in XML, which is used to represent the data on a computer, and an example of context content description data conforming to this DTD. In the examples shown in these drawings, the attribute povvalue of a <section> or <segment> to which no attribute povlist is added corresponds to the attribute povlist added to <contents>.

[Twelfth Embodiment]
Hereinafter, a twelfth embodiment according to the present invention will be described. In the present embodiment, a moving image in an MPEG1 system stream is assumed as the media content. In this case, a media segment corresponds to one scene cut.

FIG. 86 is a block diagram of a data processing method according to the embodiment of the present invention. In FIG. 86, reference numeral 4101 denotes a selection step, 4102 an extraction step, 4103 a configuration step, 4104 a delivery step, and 4105 a database. The selection step 4101 selects a scene of the media content from the context content description data and outputs the start time and end time of the scene together with data identifying the file in which the scene is stored. The extraction step 4102 receives the data identifying the file, the start time, and the end time output by the selection step 4101 and, referring to the physical content description data, extracts from the media content file the data of the section delimited by the input start time and end time. The configuration step 4103 multiplexes the data output by the extraction step 4102 to construct an MPEG1 system stream. The delivery step 4104 delivers the MPEG1 system stream created by the configuration step 4103 over a line. The database 4105 stores the media content together with its physical content description data and context content description data.

The physical content description data in the present embodiment has the same structure as that described in the fifth embodiment. That is, physical content description data having the structure shown in FIG. 37 is used.

FIG. 87 shows the structure of the context content description data in the present embodiment. It is obtained by adding, to the context content description data of the seventh embodiment, an association with the <mediaobject> of the physical content description data. That is, the child element of the root <contents> of the context content description data is <mediaobject>, and the child element of this <mediaobject> is <section>. <section> and <segment> are the same as in the seventh embodiment. An attribute id is added to the <mediaobject> of the context content description data, and this id establishes the correspondence with the <mediaobject> of the physical content description data. That is, a scene of the media content described by the descendants of a <mediaobject> of the context content description data is stored in the file indicated by the <mediaobject> of the physical content description data that has the same id value. The time information start and end of a <segment> is set as the time from the beginning of each file. That is, when one media content item consists of a plurality of files, the time at the beginning of each file is 0, and the start time of each scene is expressed as the elapsed time from the beginning of the file in which the scene is stored.
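The id-based correspondence can be sketched in Python as follows. The sample XML, the time format, the file names, and the dictionary standing in for the physical content description data are all hypothetical; the real structures are those of FIG. 37 and FIG. 87.

```python
import xml.etree.ElementTree as ET

# Hypothetical context content description data (structure after FIG. 87).
CONTEXT = """
<contents>
  <mediaobject id="1">
    <section><segment start="00:00:10" end="00:00:25"/></section>
  </mediaobject>
  <mediaobject id="2">
    <section><segment start="00:00:00" end="00:00:05"/></section>
  </mediaobject>
</contents>
"""

# Stand-in for the physical content description data: mediaobject id -> file.
PHYSICAL_FILES = {"1": "part1.mpg", "2": "part2.mpg"}


def scenes_with_files(context_xml, files_by_id):
    """List (file, start, end) for every segment, resolving files by shared id."""
    root = ET.fromstring(context_xml)
    scenes = []
    for media_object in root.findall("mediaobject"):
        file_name = files_by_id[media_object.get("id")]
        for segment in media_object.iter("segment"):
            # start and end count from the beginning of this file, not of the content.
            scenes.append((file_name, segment.get("start"), segment.get("end")))
    return scenes


scenes = scenes_with_files(CONTEXT, PHYSICAL_FILES)
```

Because each file restarts at time 0, a start time is only meaningful together with the file it belongs to, which is why the sketch always returns the two together.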

As one example of representing the physical content description data and the context content description data of the present embodiment on a computer, a description in Extensible Markup Language (XML) can be used. For the physical content description data, FIG. 39 shown in the fifth embodiment is one example. FIGS. 88 to 96 show a DTD for describing the context content description data shown in FIG. 87 in XML, and an example of context content description data conforming to this DTD.

The processing of the selection step 4101 is described below. The selection step 4101 selects scenes using any of the techniques described in the seventh to tenth embodiments. However, in addition to the start time and end time, it also outputs the id of the corresponding <mediaobject> of the physical content description data. When the physical content description data is represented by an XML document conforming to the DTD shown in FIG. 39 and the context content description data is represented by an XML document conforming to the DTD shown in FIGS. 88 to 96, an example of the output of the selection step 4101 has the same form as that shown in FIG. 46 for the fifth embodiment.

The processing of the extraction step 4102 is the same as that of the extraction step described in the fifth embodiment. Likewise, the configuration step 4103 is the same as the configuration step described in the fifth embodiment, and the delivery step 4104 is the same as the delivery step described in the fifth embodiment.

[Thirteenth Embodiment]
Hereinafter, a thirteenth embodiment according to the present invention will be described. In the present embodiment, a moving image in an MPEG1 system stream is assumed as the media content. In this case, a media segment corresponds to one scene cut.

FIG. 97 is a block diagram of a data processing method according to the embodiment of the present invention. In FIG. 97, reference numeral 4401 denotes a selection step, 4402 an extraction step, 4403 a configuration step, 4404 a delivery step, and 4405 a database. The selection step 4401 selects a scene of the media content from the context content description data and outputs the start time and end time of the scene together with data identifying the file in which the scene is stored; it is the same as the selection step described in the twelfth embodiment. The extraction step 4402 receives the data identifying the file, the start time, and the end time output by the selection step 4401 and, referring to the physical content description data, extracts from the media content file the data of the section delimited by the input start time and end time; it is the same as the extraction step described in the twelfth embodiment. The configuration step 4403 multiplexes part or all of the streams output by the extraction step 4402 according to the line conditions determined by the delivery step 4404, and constructs an MPEG1 system stream; it is the same as the configuration step described in the sixth embodiment. The delivery step 4404 determines the conditions of the delivery line and reports the result to the configuration step 4403, and delivers the MPEG1 system stream created by the configuration step 4403 over the line; it is the same as the delivery step described in the sixth embodiment. The database 4405 stores the media content together with its physical content description data and context content description data.

In the present embodiment, an MPEG1 system stream is assumed as the media content, but the same effect can be obtained with other formats as long as the time code of each frame can be obtained.

The embodiments described below summarize forms corresponding to the inventions set forth in the claims. Hereinafter, the term "sound information" is used for information relating to sound, including the presence of sound, the absence of sound, speech, music, quiet, and external noise, and the term "video information" is used for visually perceivable information, including moving images, still images, and characters such as telops. The score may be a score calculated from the content of the sound information (presence of sound, absence of sound, speech, music, quiet, external noise, and the like), a score assigned according to the presence or absence of telops in the video information, or a combination of these. Scores other than these may also be used.

[Fourteenth Embodiment]
Hereinafter, a fourteenth embodiment according to the present invention will be described. FIG. 98 is a block diagram showing the processing of the data processing method according to the present embodiment. In the figure, reference numeral 501 denotes a selection step and 503 denotes an extraction step. The selection step 501 selects at least one section or scene of the media content based on the scores in the context content description data and outputs the selected section or scene. The selected section is output as, for example, the start time and end time of the selected section. The extraction step 503 extracts only the data of the sections of the media content delimited by the selected sections output by the selection step 501 (hereinafter referred to as media segments), that is, only the data of the selected sections.
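The division of work between the selection step 501 and the extraction step 503 can be sketched as below. The threshold rule, the Segment type, the half-open interval convention, and the timestamped sample list are assumptions chosen for illustration; this passage does not fix a particular selection rule or data representation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    start: float  # seconds from the beginning of the content
    end: float
    score: int


def select_sections(segments, threshold):
    """Selection step (sketch): output (start, end) of segments scoring at least threshold."""
    return [(s.start, s.end) for s in segments if s.score >= threshold]


def extract(samples, sections):
    """Extraction step (sketch): keep only payloads whose timestamp is in a selected section.

    Sections are treated as half-open intervals [start, end) by assumption.
    """
    return [payload for t, payload in samples
            if any(start <= t < end for start, end in sections)]


segments = [Segment(0, 10, 2), Segment(10, 20, 5), Segment(20, 30, 4)]
sections = select_sections(segments, threshold=4)
samples = [(5, "a"), (12, "b"), (25, "c"), (30, "d")]
selected_data = extract(samples, sections)
```

Keeping selection independent of the media data means the same selected sections can be handed to either an extraction step or a reproduction step.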

The score may be an importance based on objective significance within the context content, or an importance based on the viewpoint of a keyword, such as a character or an event, selected by the user or the like.

[Fifteenth Embodiment]
Hereinafter, a fifteenth embodiment according to the present invention will be described. FIG. 99 is a block diagram showing the processing of the data processing method according to the present embodiment. In the figure, reference numeral 501 denotes a selection step and 505 denotes a reproduction step. The reproduction step 505 reproduces only the data of the selected sections delimited by the selected sections output by the selection step 501. The selection step 501 is the same as the selection steps described in the first to thirteenth embodiments, and its description is therefore omitted.

[Sixteenth Embodiment]
Hereinafter, a sixteenth embodiment according to the present invention will be described. FIG. 100 is a block diagram showing the processing of the data processing method according to the present embodiment. In the figure, reference numeral 507 denotes a video selection step and 509 denotes a sound selection step. The video selection step 507 and the sound selection step 509 are included in the selection step 501 described in the fourteenth and fifteenth embodiments.

The video selection step 507 selects a section or scene of the video information with reference to the context content description data of the video information, and outputs the selected section. The sound selection step 509 selects a section or scene of the sound information with reference to the context content description data of the sound information, and outputs the selected section. The selected section is output as, for example, the start time and end time of the selected section. For the selected sections of video information chosen in the video selection step 507 and the selected sections of sound information chosen in the sound selection step 509, only the data of the selected sections is extracted by the extraction step 503 described in the fourteenth embodiment or reproduced by the reproduction step 505 described in the fifteenth embodiment.

[Seventeenth Embodiment]
Hereinafter, a seventeenth embodiment according to the present invention will be described. FIG. 101 is a block diagram showing the processing of the data processing method according to the present embodiment. In the figure, reference numeral 511 denotes a determination step, 513 a selection step, 503 an extraction step, and 505 a reproduction step.

(Example 1)
First, in Example 1, the media content has a plurality of different pieces of media information for the same time period. The determination step 511 receives as input the physical content description data, which describes the data structure of the media content, and determines which media information is to be a selection target based on determination conditions such as the capability of the receiving terminal, the state of the delivery line, and requests from the user. The selection step 513 receives as input the data determined to be a selection target in the determination step 511, the physical content description data, and the context content description data and, referring to the input physical content description data, performs selection processing only on the data that the determination step 511 determined to be a selection target. The extraction step 503 and the reproduction step 505 are the same as the extraction step of the fourteenth embodiment and the reproduction step of the fifteenth embodiment, respectively, and their description is therefore omitted. The media information includes data such as video information, sound information, and text information; in this example, the media information is assumed to include at least one of data relating to video information and data relating to sound information.

In this example, the different pieces of video information or sound information that the media content has for the same time period are each assigned to a channel as shown in FIG. 102, and further to a layer obtained by hierarchizing a channel. For example, for the channels carrying moving images, standard-resolution video information is assigned to channel 1, layer 1, and high-resolution video information is assigned to channel 1, layer 2; for the channels carrying sound information, stereo sound information is assigned to channel 1 and monaural sound information is assigned to channel 2. FIGS. 103 and 104 show a Document Type Definition (DTD) for describing the physical content description data in XML and an example of physical content description data conforming to this DTD.

Next, the processing of the determination step 511 of this example when the media content has such a channel and layer structure is described with reference to FIGS. 105 to 108. First, as shown in FIG. 105, step S101 determines whether there is a request from the user. If there is a user request in step S101, the determination process SR-A based on the user request, shown in FIG. 106, is executed.

If there is no user request in step S101, the process proceeds to step S103, which determines whether the receivable information is video information only, sound information only, or both video information and sound information. In step S103, if the receivable information is video information only, the determination process SR-B for video information shown in FIG. 107 is executed; if it is sound information only, the determination process SR-C for sound information shown in FIG. 108 is executed; and if it is both video information and sound information, the process proceeds to step S105. Step S105 determines the capability of the receiving terminal that receives the video information and sound information, for example its video display capability, sound reproduction capability, and speed of decompressing compressed information. If the capability is high, the process proceeds to step S107; if it is low, the process proceeds to step S109. Step S107 determines the state of the line that carries the video information and sound information. If the line is congested, the process proceeds to step S109; if it is not congested, the process proceeds to step S111.

Step S109 is executed when the capability of the receiving terminal is low or the line is congested. In this case, the receiving terminal receives the standard-resolution video information of channel 1, layer 1 and the monaural sound information of channel 2. Step S111, on the other hand, is executed when the capability of the receiving terminal is high and the line is not congested. In this case, the receiving terminal receives the high-resolution video information of channel 1, layer 2 and the stereo sound information of channel 1.

Next, the determination process SR-A based on a user request, shown in FIG. 106, is described. In this example, a request from the user selects a video layer and a sound channel. First, step S151 determines whether there is a user request concerning video. If there is a user request concerning video in step S151, the process proceeds to step S153; if there is none, the process proceeds to step S159. Step S153 determines whether the user's video request selects layer 2. If YES, the process proceeds to step S155 and layer 2 is selected as the video information; if NO, the process proceeds to step S157 and layer 1 is selected. Step S159 determines whether there is a user request concerning sound. If there is a user request concerning sound in step S159, the process proceeds to step S161; if there is none, the process ends. Step S161 determines whether the user's sound request selects channel 1. If YES, the process proceeds to step S163 and channel 1 is selected as the sound information; if NO, the process proceeds to step S165 and channel 2 is selected.

Next, the determination process SR-B for video information, shown in FIG. 107, is described. First, step S171 determines the capability of the receiving terminal that receives the video information. If the capability is high, the process proceeds to step S173; if it is low, the process proceeds to step S175. Step S173 determines the state of the line. If the line is congested, the process proceeds to step S175; if it is not congested, the process proceeds to step S177.

Step S175 is executed when the capability of the receiving terminal is low or the line is congested. In this case, the receiving terminal receives only the standard-resolution video information of channel 1, layer 1. Step S177, on the other hand, is executed when the capability of the receiving terminal is high and the line is not congested. In this case, the receiving terminal receives only the high-resolution video information of channel 1, layer 2.

Next, the determination process SR-C for sound information, shown in FIG. 108, is described. First, step S181 determines the capability of the receiving terminal that receives the sound information. If the capability is high, the process proceeds to step S183; if it is low, the process proceeds to step S185. Step S183 determines the state of the line. If the line is congested, the process proceeds to step S185; if it is not congested, the process proceeds to step S187.

Step S185 is executed when the capability of the receiving terminal is low or the line is congested. In this case, the receiving terminal receives only the monaural sound information of channel 2. Step S187, on the other hand, is executed when the capability of the receiving terminal is high and the line is not congested. In this case, the receiving terminal receives only the stereo sound information of channel 1.
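The determination flow of Example 1 (FIGS. 105 to 108) can be condensed into a short Python sketch. The function name decide_streams, the representation of a user request as a dictionary, and the boolean inputs are hypothetical; how terminal capability and line congestion are measured is not specified here and is simply passed in.

```python
def decide_streams(receivable, terminal_capable, line_congested, user_request=None):
    """Return (video_layer, audio_channel); None means that kind is not selected.

    receivable: set containing "video" and/or "audio".
    user_request: optional dict with keys "video_layer" and/or "audio_channel".
    """
    if user_request:  # S101: a user request exists, so run SR-A (FIG. 106)
        video = audio = None
        if "video_layer" in user_request:  # S151
            # S153 to S157: layer 2 if requested, otherwise layer 1
            video = 2 if user_request["video_layer"] == 2 else 1
        if "audio_channel" in user_request:  # S159
            # S161 to S165: channel 1 if requested, otherwise channel 2
            audio = 1 if user_request["audio_channel"] == 1 else 2
        return video, audio

    # S105/S107, S171/S173 and S181/S183 apply the same test: the high-quality
    # stream is used only when the terminal is capable and the line is not congested.
    high = terminal_capable and not line_congested
    video = (2 if high else 1) if "video" in receivable else None  # layer 2: high resolution
    audio = (1 if high else 2) if "audio" in receivable else None  # channel 1: stereo
    return video, audio


both = decide_streams({"video", "audio"}, terminal_capable=True, line_congested=False)
```

The branches SR-B and SR-C differ from the combined case only in which kinds of information are receivable, so the sketch folds all three into one test rather than mirroring the flowcharts step by step.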

(Example 2)
Example 2 differs from Example 1 only in the determination step 511. The determination step 511 of this example receives as input the physical content description data, which describes the data structure of the media content, and determines, based on determination conditions such as the capability of the receiving terminal, the state of the delivery line, and requests from the user, whether video information only, sound information only, or both video information and sound information are to be the selection target. The selection step 513, the extraction step 503, and the reproduction step 505 are the same as the corresponding steps described above, and their description is therefore omitted.

Next, the processing of the determination step 511 of this example is described with reference to FIGS. 109 and 110. First, as shown in FIG. 109, step S201 determines whether there is a request from the user. If there is a user request in step S201, the process proceeds to step S203; if there is none, the process proceeds to step S205. Step S203 determines whether the user request is for video information only. If YES, the process proceeds to step S253 and only video information is determined to be the selection target; if NO, the process proceeds to step S207. Step S207 determines whether the user request is for sound information only. If YES, the process proceeds to step S255 and only sound information is determined to be the selection target; if NO, the process proceeds to step S251 and both video information and sound information are determined to be the selection target.

Step S205, which is reached when there is no user request, determines whether the receivable information is video information only, sound information only, or both video information and sound information. In step S205, if the receivable information is video information only, the process proceeds to step S253 and only video information is determined to be the selection target; if it is sound information only, the process proceeds to step S255 and only sound information is determined to be the selection target; and if it is both video information and sound information, the process proceeds to step S209.

Step S209 determines the state of the line. If the line is not congested, the process proceeds to step S251 and both video information and sound information are determined to be the selection target; if the line is congested, the process proceeds to step S211. Step S211 determines whether the information delivered over the line includes sound information. If YES, the process proceeds to step S255 and sound information is determined to be the selection target; if NO, the process proceeds to step S253 and video information is determined to be the selection target.
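The determination flow of Example 2 can be sketched in Python as follows. The function name select_media_kinds and the use of sets of the strings "video" and "audio" to represent requests and receivable information are assumptions made for illustration.

```python
BOTH = frozenset({"video", "audio"})


def select_media_kinds(receivable, line_congested, stream_has_audio, user_request=None):
    """Return the kinds of information to be the selection target.

    receivable and user_request are sets drawn from {"video", "audio"};
    user_request=None means there is no user request.
    """
    if user_request is not None:               # S201: a user request exists
        if set(user_request) == {"video"}:     # S203 -> S253
            return frozenset({"video"})
        if set(user_request) == {"audio"}:     # S207 -> S255
            return frozenset({"audio"})
        return BOTH                            # S251

    if set(receivable) == {"video"}:           # S205 -> S253
        return frozenset({"video"})
    if set(receivable) == {"audio"}:           # S205 -> S255
        return frozenset({"audio"})

    if not line_congested:                     # S209 -> S251
        return BOTH
    # S211: on a congested line, prefer sound if the delivered information has it.
    return frozenset({"audio"}) if stream_has_audio else frozenset({"video"})


congested_choice = select_media_kinds({"video", "audio"}, line_congested=True,
                                      stream_has_audio=True)
```

On a congested line, the flow narrows the selection target to sound information whenever it is available and to video information otherwise.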

(Example 3)
In Example 3, the media content has a plurality of different pieces of video information and/or sound information for the same time period. In addition to the determination made by the determination step 511 of Example 2 as to whether video information only, sound information only, or both video information and sound information are to be the selection target, the determination step 511 of this example further determines which video information and/or sound information is to be the selection target, based on determination conditions such as the capability of the receiving terminal, the capability of the delivery line, and the state of the line. The selection step 513, the extraction step 503, and the reproduction step 505 are the same as the corresponding steps described above, and their description is therefore omitted.

In this example, as in Example 1, the different pieces of video information or sound information that the media content has for the same time period are each assigned to channels and layers. For example, for the channels carrying moving images, standard-resolution video information is assigned to channel 1, layer 1, and high-resolution video information is assigned to channel 1, layer 2; for the channels carrying sound information, stereo sound information is assigned to channel 1 and monaural sound information is assigned to channel 2.

Next, the processing of determination step 511 in this example is described with reference to FIGS. 111 to 113. As shown in FIG. 111, the information to be selected is first determined by determination step 511 of Example 2 (selection target determination process SR-D). Next, in step S301, the result of SR-D is examined. If the selection target is video information only, the video information determination process SR-E shown in FIG. 112 is executed; if it is sound information only, the sound information determination process SR-F shown in FIG. 113 is executed; if it is both video and sound information, the process proceeds to step S303. In step S303, the capability of the receiving terminal that receives the video and sound information is determined. If the capability is high, the process proceeds to step S305; if low, to step S307. In step S305, the capability of the line, such as its transmission speed, is determined. If the capability is high, the process proceeds to step S309; if low, to step S307. In step S309, the status of the line is determined. If the line is congested, the process proceeds to step S307; if not, to step S311.

Step S307 is executed when the capability of the receiving terminal is low, the capability of the line is low, or the line is congested. In this case the receiving terminal receives the standard-resolution video information of channel 1, layer 1 and the monaural sound information of channel 2. Step S311, by contrast, is executed when the capability of the receiving terminal is high, the capability of the line is high, and the line is not congested. In this case the receiving terminal receives the high-resolution video information of channel 1, layer 2 and the stereo sound information of channel 1.
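The branch taken when both video and sound are selection targets reduces to a single three-way condition. The following is a minimal Python sketch of that outcome; the function name and the string labels for the streams are illustrative assumptions and do not appear in the specification.

```python
def select_av_streams(terminal_high: bool, line_high: bool, congested: bool) -> tuple[str, str]:
    """Return (video stream, sound stream) when both kinds are selection targets."""
    # All three checks (terminal capability, line capability, congestion) must pass
    # to reach step S311; any failure leads to step S307.
    if terminal_high and line_high and not congested:
        return ("video: channel 1, layer 2 (high resolution)",
                "sound: channel 1 (stereo)")          # step S311
    return ("video: channel 1, layer 1 (standard resolution)",
            "sound: channel 2 (monaural)")            # step S307


if __name__ == "__main__":
    print(select_av_streams(True, True, False))
    print(select_av_streams(True, False, False))
```

Only one of the eight input combinations yields the high-quality pair; every other combination falls back to standard-resolution video with monaural sound.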

Next, the video information determination process SR-E shown in FIG. 112 is described. First, in step S351, the capability of the receiving terminal that receives the video information is determined. If the capability is high, the process proceeds to step S353; if low, to step S355. In step S353, the capability of the line is determined. If the capability is high, the process proceeds to step S357; if low, to step S355. In step S357, the status of the line is determined. If the line is congested, the process proceeds to step S355; if not, to step S359.

Step S355 is executed when the capability of the receiving terminal is low, the capability of the line is low, or the line is congested. In this case the receiving terminal receives only the standard-resolution video information of channel 1, layer 1. Step S359, by contrast, is executed when the capability of the receiving terminal is high, the capability of the line is high, and the line is not congested. In this case the receiving terminal receives only the high-resolution video information of channel 1, layer 2.

Next, the sound information determination process SR-F shown in FIG. 113 is described. First, in step S371, the capability of the receiving terminal that receives the sound information is determined. If the capability is high, the process proceeds to step S373; if low, to step S375. In step S373, the capability of the line is determined. If the capability is high, the process proceeds to step S377; if low, to step S375. In step S377, the status of the line is determined. If the line is congested, the process proceeds to step S375; if not, to step S379.

Step S375 is executed when the capability of the receiving terminal is low, the capability of the line is low, or the line is congested. In this case the receiving terminal receives only the monaural sound information of channel 2. Step S379, by contrast, is executed when the capability of the receiving terminal is high, the capability of the line is high, and the line is not congested. In this case the receiving terminal receives only the stereo sound information of channel 1.
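Processes SR-E and SR-F apply the same three checks and differ only in which pair of streams they choose between, so they can be sketched as one table-driven function. This is an illustrative Python sketch; the `STREAMS` table and the function name are assumptions made for the example.

```python
# (stream chosen when every check passes, fallback stream) for each kind of information,
# following the channel and layer assignment used in this example.
STREAMS = {
    "video": ("channel 1, layer 2 (high resolution)",
              "channel 1, layer 1 (standard resolution)"),
    "sound": ("channel 1 (stereo)", "channel 2 (monaural)"),
}


def select_single_stream(kind: str, terminal_high: bool, line_high: bool, congested: bool) -> str:
    """SR-E (kind='video') or SR-F (kind='sound'): pick the one stream to receive."""
    best, fallback = STREAMS[kind]
    # Steps S359 / S379 are reached only when all three checks pass;
    # otherwise the process ends at steps S355 / S375.
    if terminal_high and line_high and not congested:
        return best
    return fallback


if __name__ == "__main__":
    print(select_single_stream("video", True, True, False))
    print(select_single_stream("sound", True, True, True))
```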

(Example 4)
In Example 4, representative data of the corresponding media segment is added as an attribute to each element in the lowest layer of the context content description data, and the media content has a plurality of different pieces of media information for the same time period. Determination step 511 takes as input physical content description data, which describes the data structure of the media content, and determines which media information and/or representative data is to be the selection target, based on determination conditions such as the capability of the receiving terminal, the capability of the delivery line, the status of the line, and requests from the user.

Description of the selection step 513, extraction step 503, and reproduction step 505 is omitted. Media information is information such as video information, sound information, and text data; in this example, the media information is assumed to include at least one of video information and sound information. Representative data is, for video information, for example a representative image or low-resolution video data for each media segment, and, for sound information, for example key-phrase data for each media segment.

In this example, as in the earlier examples, the different pieces of video information or sound information that the media content has for the same time period are each assigned to a channel and layer. For example, among the channels carrying video, standard-resolution video information is assigned to channel 1, layer 1 and high-resolution video information to channel 1, layer 2; among the channels carrying sound, stereo sound information is assigned to channel 1 and monaural sound information to channel 2.

Next, the processing of determination step 511 in this example is described with reference to FIGS. 114 to 118. As shown in FIG. 114, step S401 determines whether there is a request from the user. If there is a user request, the user-request determination process SR-G shown in FIG. 116 is executed.

If there is no user request in step S401, the process proceeds to step S403, which determines whether the receivable information is video information only, sound information only, or both video and sound information. If the receivable information is video information only, the video information determination process SR-H shown in FIG. 117 is executed; if it is sound information only, the sound information determination process SR-I shown in FIG. 118 is executed; if it is both, the process proceeds to step S405 shown in FIG. 115.

Step S405 determines the capability of the receiving terminal. After step S405, step S407, which determines the capability of the line, and step S409, which determines whether the line is congested, are executed in that order. By executing steps S405, S407, and S409, determination step 511 of this example determines the channel and layer, or the representative data, of the video and sound information to be received, in accordance with Table 1 below.

Next, the user-request determination process SR-G shown in FIG. 116 is described. First, step S451 determines whether the user's request is for video information only. If YES, the video information determination process SR-H is performed; if NO, the process proceeds to step S453. Step S453 determines whether the user's request is for sound information only. If YES, the sound information determination process SR-I is performed; if NO, the process returns to the main routine and proceeds to step S405.
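The top-level routing of this example (steps S401 and S403 together with process SR-G) can be sketched as follows. The function name and the string encoding of requests ("video", "sound", "both", or `None` for no request) are assumptions made for illustration only.

```python
from typing import Optional


def route_example4(user_request: Optional[str], receivable: str) -> str:
    """Return the routine executed next: 'SR-H', 'SR-I', or 'S405'."""
    # S401: if the user made a request, SR-G decides on the basis of that request
    # (S451, S453); otherwise S403 decides on the basis of what can be received.
    basis = user_request if user_request is not None else receivable
    if basis == "video":
        return "SR-H"   # video information only
    if basis == "sound":
        return "SR-I"   # sound information only
    return "S405"       # both: continue with the terminal and line checks


if __name__ == "__main__":
    print(route_example4(None, "both"))
    print(route_example4("sound", "both"))
```

Both decision paths map the three cases to the same three destinations, which is why a single `basis` variable is enough in this sketch.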

Next, the video information determination process SR-H shown in FIG. 117 is described. First, step S461 determines the capability of the receiving terminal. After step S461, step S463, which determines the capability of the line, and step S465, which determines whether the line is congested, are executed in that order. By executing steps S461, S463, and S465, process SR-H behaves as follows. When the capability of the terminal is high, the capability of the line is high, and the line is not congested, only the video information of channel 1, layer 2 is received (step S471). When the capability of the terminal is low, the capability of the line is low, and the line is not congested, only the representative data of the video information is received (step S473). In all other cases, only the video information of channel 1, layer 1 is received (step S475).
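The three outcomes of process SR-H can be written as an ordered chain of conditions. The sketch below follows the conditions exactly as stated in the text; the function name and the returned labels are illustrative assumptions.

```python
def sr_h(terminal_high: bool, line_high: bool, congested: bool) -> str:
    """Video information determination process SR-H, as described in the text."""
    if terminal_high and line_high and not congested:
        return "channel 1, layer 2 video"        # step S471
    if not terminal_high and not line_high and not congested:
        return "video representative data"       # step S473
    return "channel 1, layer 1 video"            # step S475 (all other cases)


if __name__ == "__main__":
    print(sr_h(True, True, False))
    print(sr_h(False, False, False))
    print(sr_h(True, False, True))
```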

Next, the sound information determination process SR-I shown in FIG. 118 is described. First, step S471 determines the capability of the receiving terminal. After step S471, step S473, which determines the capability of the line, and step S475, which determines whether the line is congested, are executed in that order. By executing steps S471, S473, and S475, the sound information determination process SR-I behaves as follows. When the capability of the terminal is high and the capability of the line is high, or when the capability of the terminal is high, the capability of the line is low, and the line is not congested, only the sound information of channel 1 is received (step S491). When the capability of the terminal is low, the capability of the line is low, and the line is congested, only the representative data of the sound information is received (step S493). In all other cases, only the sound information of channel 2 is received (step S495).
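Because the conditions for process SR-I are less regular than those of the earlier processes, it can help to enumerate all eight input combinations. The following Python sketch does so; the function name, the returned labels, and the `truth_table` helper are assumptions made for illustration.

```python
from itertools import product


def sr_i(terminal_high: bool, line_high: bool, congested: bool) -> str:
    """Sound information determination process SR-I, as described in the text."""
    # S491: terminal high and line high, or terminal high, line low, and not congested.
    if terminal_high and (line_high or not congested):
        return "channel 1 sound"
    # S493: terminal low, line low, and congested.
    if not terminal_high and not line_high and congested:
        return "sound representative data"
    # S495: every remaining combination.
    return "channel 2 sound"


def truth_table() -> dict:
    """Map every (terminal_high, line_high, congested) combination to its outcome."""
    return {combo: sr_i(*combo) for combo in product([True, False], repeat=3)}


if __name__ == "__main__":
    for (terminal, line, congested), outcome in truth_table().items():
        print(f"terminal_high={terminal!s:5} line_high={line!s:5} congested={congested!s:5} -> {outcome}")
```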

(Example 5)
In Example 5, determination step 511 determines, based on determination conditions such as the capability of the receiving terminal, the capability of the delivery line, the status of the line, and requests from the user, whether the selection target is the entire data of the corresponding media segment only, the representative data of that media segment only, or both the entire data and the representative data.

In this example, as in Example 4, representative data of the corresponding media segment is added as an attribute to each element in the lowest layer of the context content description data. The representative data is, for video information, for example a representative image or low-resolution video data for each media segment, and, for sound information, for example key-phrase data for each media segment.

Next, the processing of determination step 511 in this example is described with reference to FIGS. 119 to 121. As shown in FIG. 119, step S501 determines whether there is a request from the user. If there is a user request, the user-request determination process SR-J shown in FIG. 121 is executed.

If there is no user request in step S501, the process proceeds to step S503, which determines whether the receivable data is the representative data of the media segment only, the entire data of the media segment only, or both. If the receivable data is representative data only, the process proceeds to step S553 shown in FIG. 120 and only the representative data is determined to be the selection target; if it is entire data only, the process proceeds to step S555 and only the entire data is determined to be the selection target; if it is both, the process proceeds to step S505.

In step S505, the capability of the line is determined. If the capability is high, the process proceeds to step S507; if low, to step S509. Steps S507 and S509 both determine whether the line is congested. If step S507 finds that the line is not congested, the process proceeds to step S551 and both the entire data and the representative data are determined to be selection targets. If step S509 finds that the line is congested, the process proceeds to step S553 and the representative data is made the selection target. If step S507 finds that the line is congested, or step S509 finds that it is not, the process proceeds to step S555 and the entire data is made the selection target.

In the user-request determination process SR-J, step S601 first determines whether the user's request is for representative data only. If YES, the process proceeds to step S553 and only the representative data is made the selection target; if NO, the process proceeds to step S603. Step S603 determines whether the user's request is for entire data only. If YES, the process proceeds to step S555 and only the entire data is made the selection target; if NO, the process proceeds to step S551 and both the entire data and the representative data are made selection targets.
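The whole of determination step 511 in this example, including process SR-J, fits in one short function. The following Python sketch is illustrative only: the function name, the constants, and the string encoding of requests ("representative", "entire", "both", or `None` for no request) are assumptions.

```python
from typing import Optional

BOTH = "entire data and representative data"   # step S551
REPRESENTATIVE = "representative data only"    # step S553
ENTIRE = "entire data only"                    # step S555


def decide_example5(user_request: Optional[str], receivable: str,
                    line_high: bool, congested: bool) -> str:
    """Return the selection target for a media segment."""
    if user_request is not None:                 # S501 -> process SR-J
        if user_request == "representative":     # S601
            return REPRESENTATIVE
        if user_request == "entire":             # S603
            return ENTIRE
        return BOTH
    if receivable == "representative":           # S503
        return REPRESENTATIVE
    if receivable == "entire":
        return ENTIRE
    if line_high and not congested:              # S505 -> S507, not congested
        return BOTH
    if not line_high and congested:              # S505 -> S509, congested
        return REPRESENTATIVE
    return ENTIRE                                # the two remaining combinations


if __name__ == "__main__":
    print(decide_example5(None, "both", True, False))
    print(decide_example5("representative", "both", True, False))
```

Note that a user request bypasses the line checks entirely in this flow; the line capability and congestion matter only when no request is made and both kinds of data can be received.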

[Eighteenth Embodiment]
An eighteenth embodiment of the present invention is described below. FIG. 122 is a block diagram showing the processing of the data processing method of this embodiment. In the figure, 501 denotes a selection step, 503 an extraction step, and 515 a configuration step. The selection step 501 and extraction step 503 are the same as the selection step and extraction step described in the fourteenth embodiment, and their description is omitted.

Configuration step 515 constructs a media content stream from the data of the selected sections extracted by extraction step 503. Specifically, configuration step 515 multiplexes the data output by extraction step 503 to form a stream.
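The specification states only that the extracted data is multiplexed into a stream; it does not say how. As one possible reading, the Python sketch below interleaves time-ordered video and sound packets by timestamp. The packet layout `(timestamp, kind, payload)` and the function name are assumptions made for this example.

```python
import heapq


def multiplex(video_packets, sound_packets):
    """Interleave two timestamp-sorted packet lists into a single stream.

    Each packet is a tuple (timestamp_seconds, kind, payload). Both inputs must
    already be sorted by timestamp; packets with equal timestamps keep the order
    of the arguments (video first).
    """
    return list(heapq.merge(video_packets, sound_packets, key=lambda packet: packet[0]))


if __name__ == "__main__":
    video = [(0.0, "video", b"v0"), (1.0, "video", b"v1")]
    sound = [(0.5, "sound", b"s0"), (1.5, "sound", b"s1")]
    for packet in multiplex(video, sound):
        print(packet)
```

A real multiplexer would also write container headers and timing information; this sketch shows only the interleaving idea.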

[Nineteenth Embodiment]
A nineteenth embodiment of the present invention is described below. FIG. 123 is a block diagram showing the processing of the data processing method of this embodiment. In the figure, 501 denotes a selection step, 503 an extraction step, 515 a configuration step, and 517 a delivery step. The selection step 501 and extraction step 503 are the same as the selection step and extraction step described in the fourteenth embodiment, and the configuration step 515 is the same as the configuration step described in the eighteenth embodiment, so their description is omitted.

Delivery step 517 delivers the stream constructed by configuration step 515 over a line. Delivery step 517 may include a step of determining the status of the delivery line, and configuration step 515 may include a step of adjusting the amount of data making up the file according to the line status determined by delivery step 517.
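The specification does not say how the amount of data is adjusted to the line status. One plausible approach, sketched below in Python, is to keep the highest-scoring segments that fit within a byte budget derived from the line status and to emit them in playback order. The segment fields (`start`, `score`, `size`), the budget parameter, and the greedy strategy are all assumptions for illustration, not the patent's method.

```python
def fit_to_line(segments, budget_bytes):
    """Keep the highest-scoring segments whose total size fits within budget_bytes.

    Each segment is a dict with 'start' (seconds), 'score', and 'size' (bytes).
    The result is returned in playback order.
    """
    kept, used = [], 0
    # Consider segments from most to least important (greedy choice).
    for segment in sorted(segments, key=lambda s: s["score"], reverse=True):
        if used + segment["size"] <= budget_bytes:
            kept.append(segment)
            used += segment["size"]
    return sorted(kept, key=lambda s: s["start"])


if __name__ == "__main__":
    segments = [
        {"start": 0, "score": 0.9, "size": 40},
        {"start": 10, "score": 0.2, "size": 40},
        {"start": 20, "score": 0.7, "size": 40},
    ]
    # A congested line might allow 80 bytes; an uncongested one, 120.
    print([s["start"] for s in fit_to_line(segments, 80)])
    print([s["start"] for s in fit_to_line(segments, 120)])
```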

[Twentieth Embodiment]
A twentieth embodiment of the present invention is described below. FIG. 124 is a block diagram showing the processing of the data processing method of this embodiment. In the figure, 501 denotes a selection step, 503 an extraction step, 515 a configuration step, 519 a recording step, and 521 a data recording medium. Recording step 519 records the stream constructed by configuration step 515 on data recording medium 521. Data recording medium 521 records media content together with its context content description data and physical content description data, and is a hard disk, memory, DVD-RAM, or the like. The selection step 501 and extraction step 503 are the same as the selection step and extraction step described in the fourteenth embodiment, and the configuration step 515 is the same as the configuration step described in the eighteenth embodiment, so their description is omitted.

[Twenty-first Embodiment]
A twenty-first embodiment of the present invention is described below. FIG. 125 is a block diagram showing the processing of the data processing method of this embodiment. In the figure, 501 denotes a selection step, 503 an extraction step, 515 a configuration step, 519 a recording step, 521 a data recording medium, and 523 a data recording medium management step. Data recording medium management step 523 reorganizes media content that is already stored and/or media content that is about to be stored, according to the remaining capacity of data recording medium 521. More specifically, when the remaining capacity of data recording medium 521 is low, data recording medium management step 523 performs at least one of the following two processes. The first stores newly arriving content only after it has been edited. The second takes media content that is already stored, sends its context content description data and physical content description data to selection step 501, and sends the media content and physical content description data to extraction step 503, thereby reorganizing the media content; it then records the reorganized media content on data recording medium 521 and deletes the media content as it was before reorganization.

The selection step 501 and extraction step 503 are the same as the selection step and extraction step described in the fourteenth embodiment, the configuration step 515 is the same as the configuration step described in the eighteenth embodiment, and the recording step 519 and data recording medium 521 are the same as the recording step and data recording medium described in the twentieth embodiment, so their description is omitted.

[Twenty-second Embodiment]
A twenty-second embodiment of the present invention is described below. FIG. 126 is a block diagram showing the processing of the data processing method of this embodiment. In the figure, 501 denotes a selection step, 503 an extraction step, 515 a configuration step, 519 a recording step, 521 a data recording medium, and 525 a stored content management step. Stored content management step 525 reorganizes the media content stored on data recording medium 521 according to how long it has been stored. More specifically, stored content management step 525 manages the media content stored on data recording medium 521. For media content that has reached a given storage period, it sends the context content description data and physical content description data to selection step 501 and sends the media content and physical content description data to extraction step 503, thereby reorganizing the media content; it then records the reorganized media content on data recording medium 521 and deletes the media content as it was before reorganization.
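The age-based reorganization can be sketched as a periodic pass over the stored items. In the Python sketch below, the `StoredItem` class, the dictionary standing in for the data recording medium, the day-based clock, and the score threshold that stands in for the selection and extraction steps are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class StoredItem:
    name: str
    stored_day: int            # day on which the item was recorded
    segments: list             # (score, payload) pairs, in playback order
    reorganized: bool = False  # True once the item has been condensed


def manage_store(store: dict, today: int, max_age_days: int, min_score: float) -> None:
    """Condense every item that has been stored for at least max_age_days."""
    for name, item in list(store.items()):
        if item.reorganized or today - item.stored_day < max_age_days:
            continue
        # Stand-in for selection step 501 and extraction step 503: keep high scores.
        kept = [segment for segment in item.segments if segment[0] >= min_score]
        # Overwriting the entry records the reorganized content and discards the original.
        store[name] = StoredItem(name, item.stored_day, kept, reorganized=True)


if __name__ == "__main__":
    store = {
        "news": StoredItem("news", 1, [(0.9, b"a"), (0.3, b"b"), (0.8, b"c")]),
        "drama": StoredItem("drama", 25, [(0.5, b"d"), (0.1, b"e")]),
    }
    manage_store(store, today=30, max_age_days=14, min_score=0.5)
    print(store["news"].segments)
    print(store["drama"].segments)
```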

The selection steps 501 and 513, extraction step 503, reproduction step 505, video selection step 507, sound selection step 509, determination step 511, configuration step 515, delivery step 517, recording step 519, data recording medium management step 523, and stored content management step 525 of the fourteenth to twenty-second embodiments above can be realized as a data processing apparatus that has some or all of, respectively, a selection means, extraction means, reproduction means, video selection means, sound selection means, determination means, configuration means, delivery means, recording means, data recording medium management means, and stored content management means.

In the above embodiments, the media content may include data streams other than video information and sound information, such as text data. Each step of the above embodiments may be implemented in software, by storing on a program storage medium a program that causes a computer to execute all or part of the operations of the step and running it on a computer, or may be implemented with dedicated hardware circuits that provide the functions of those steps.

In the above embodiments, the context content description data and the physical content description data are described as separate entities, but they may be combined into a single description, as shown in FIGS. 127 to 132.

As described above, according to the data processing apparatus, data processing method, recording medium, and program described above, context content description data with a hierarchical structure is used, and the selection means (selection step) selects at least one section of the media content based on the scores attached to the context content description data. In particular, the extraction means (extraction step) extracts only the data corresponding to the sections selected by the selection means (selection step), or the reproduction means (reproduction step) reproduces only the data corresponding to those sections.

Consequently, the more important scenes can be freely selected from the media content, and the selected important sections can be extracted or reproduced. Moreover, because the context content description data has a hierarchical structure made up of a highest layer, a lowest layer, and other layers, scenes can be selected in arbitrary units such as chapters or sections, and a variety of selection forms are possible, such as selecting a section and then removing the unneeded paragraphs within it.
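Score-based selection over a hierarchy can be illustrated with a small tree walk. The Python sketch below is a simplified illustration, not the selection procedure defined in the embodiments: the `Node` class, the threshold rule, and the example titles and scores are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    title: str
    score: float
    children: list = field(default_factory=list)


def select(node: Node, threshold: float) -> list:
    """Return titles of lowest-layer segments reached only through nodes scoring >= threshold."""
    if node.score < threshold:
        return []                      # drop this chapter, section, or segment entirely
    if not node.children:
        return [node.title]            # a lowest-layer element: one media segment
    selected = []
    for child in node.children:
        selected.extend(select(child, threshold))
    return selected


if __name__ == "__main__":
    program = Node("program", 5, [
        Node("section A", 4, [Node("scene 1", 5), Node("scene 2", 2)]),
        Node("section B", 1, [Node("scene 3", 5)]),
    ])
    print(select(program, threshold=3))
```

In this example, section B is pruned as a whole because of its low score, even though scene 3 inside it scores highly, while section A is kept with its low-scoring scene 2 removed. This mirrors the idea of selecting a section and then discarding unneeded parts within it.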

In addition, when the score indicates importance based on the contextual content of the media content and is set so that important scenes are selected, a collection of highlight scenes of a program, for example, can be created easily. When the score instead indicates importance from the viewpoint of a keyword in the corresponding scene, sections can be selected with greater freedom by choosing the keyword. For example, by choosing keywords from a particular viewpoint, such as a character or an event, only the scenes the user wants to see can be picked out.

When the media content has a plurality of different pieces of media information for the same time period, the determination means (determination step) determines from the determination conditions which media information is to be the selection target, and the selection means (selection step) performs selection only on the data so determined. Because the determination means (determination step) can identify the most suitable media information for the given conditions, the selection means (selection step) can select media information with an appropriate amount of data.

In addition, because the determination means (determination step) determines from the determination conditions whether the selection target is video information only, sound information only, or both video information and sound information, the time the selection means (selection step) needs to select sections can be shortened.
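
The split between the determination step and the selection step might look like the sketch below. The determination conditions used here (a user request and a bandwidth threshold of 64 kbps) are hypothetical and chosen only for illustration, as are all type and function names.

```python
from enum import Enum
from typing import List, NamedTuple, Optional


class Target(Enum):
    VIDEO_ONLY = "video"
    SOUND_ONLY = "sound"
    VIDEO_AND_SOUND = "both"


class Segment(NamedTuple):
    media: str     # "video" or "sound"
    start: float   # seconds
    end: float     # seconds
    score: int     # importance


def determine_target(bandwidth_kbps: float, user_request: Optional[Target] = None) -> Target:
    """Determination step: decide which media information is eligible for selection.

    The user-request override and the bandwidth threshold are hypothetical
    determination conditions assumed for this sketch.
    """
    if user_request is not None:
        return user_request
    if bandwidth_kbps < 64:
        return Target.SOUND_ONLY
    return Target.VIDEO_AND_SOUND


def select(segments: List[Segment], target: Target, threshold: int) -> List[Segment]:
    """Selection step: score-based selection restricted to the determined media."""
    allowed = {"video", "sound"} if target is Target.VIDEO_AND_SOUND else {target.value}
    return [s for s in segments if s.media in allowed and s.score >= threshold]
```

Because `select` only examines segments of the determined media type, narrowing the target reduces the amount of data the selection step has to process.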

In addition, representative data is added to the context content description data as an attribute, and the determination means can determine the media information or representative data of the optimal category according to the determination conditions.

Furthermore, because the determination means (determination step) determines, according to the determination conditions, whether the selection target is only the entire data of the corresponding media segment, only the representative data, or both the entire data and the representative data, the time the selection means (selection step) needs to select sections can be shortened.

The present invention is useful for data processing devices, data processing methods, recording media, programs, and the like that allow a required scene to be freely selected from media content.

FIG. 1 is a block diagram of the data processing method according to the first embodiment of the present invention.
FIG. 2 is a diagram showing the data structure of the context content description data according to the first embodiment.
FIG. 3 shows an example of an XML DTD for representing the context content description data of the first embodiment on a computer, together with part of context content description data written in XML.
FIGS. 4 to 9 are successive continuations of the context content description data of FIG. 3.
FIG. 10 shows part of an XML document in which representative data is added to the context content description data of FIGS. 3 to 9, together with an example of a DTD for representing that context content description data on a computer.
FIGS. 11 to 21 are successive continuations of the context content description data of FIG. 10.
FIG. 22 is an explanatory diagram showing how importance is assigned in the first embodiment.
FIG. 23 is a flowchart of the processing of the selection step in the first embodiment.
FIG. 24 is a configuration diagram of the extraction step in the first embodiment.
FIG. 25 is a flowchart of the processing of the separation means of the extraction step in the first embodiment.
FIG. 26 is a flowchart of the processing of the video skimming means of the extraction step in the first embodiment.
FIG. 27 is a configuration diagram of an MPEG-1 video stream.
FIG. 28 is a flowchart of the processing of the audio skimming means of the extraction step in the first embodiment.
FIG. 29 is a configuration diagram of an AAU of MPEG audio.
FIG. 30 is a block diagram of an application of the first embodiment.
FIG. 31 is an explanatory diagram of the importance processing in the second embodiment.
FIG. 32 is a flowchart of the processing of the selection step in the second embodiment.
FIG. 33 is a flowchart of the processing of the selection step in the third embodiment.
FIG. 34 is an explanatory diagram showing how importance is assigned in the fourth embodiment.
FIG. 35 is a flowchart of the processing of the selection step in the fourth embodiment.
FIG. 36 is a block diagram of the data processing method according to the fifth embodiment.
FIG. 37 is a diagram showing the data structure of the physical content description data in the fifth embodiment.
FIG. 38 is a diagram showing the data structure of the context content description data in the fifth embodiment.
FIG. 39 shows an XML DTD for representing the physical content description data of the fifth embodiment on a computer, together with an example XML document.
FIG. 40 shows an XML DTD for representing the context content description data of the fifth embodiment on a computer, together with the first half of an example XML document.
FIGS. 41 to 45 are successive continuations of the context content description data of FIG. 40.
FIG. 46 is an example of the output of the selection step in the fifth embodiment.
FIG. 47 is a block diagram of the extraction step in the fifth embodiment.
FIG. 48 is a flowchart of the processing of the interface means of the extraction step in the fifth embodiment.
FIG. 49 is an example of the result of the interface means of the extraction step converting the output of the selection step in the fifth embodiment.
FIG. 50 is a flowchart of the processing of the separation means of the extraction step in the fifth embodiment.
FIG. 51 is a flowchart of the processing of the video skimming means of the extraction step in the fifth embodiment.
FIG. 52 is a flowchart of the processing of the audio skimming means of the extraction step in the fifth embodiment.
FIG. 53 is a flowchart of another process of the video skimming means of the extraction step in the fifth embodiment.
FIG. 54 is a block diagram of the data processing method according to the sixth embodiment.
FIG. 55 is a block diagram of the construction step and the delivery step in the sixth embodiment.
FIG. 56 is a block diagram of the data processing method according to the seventh embodiment.
FIG. 57 is a diagram showing the data structure of the context content description data in the seventh embodiment.
FIG. 58 shows an example of an XML DTD for representing the context content description data of the seventh embodiment on a computer, together with part of context content description data written in XML.
FIGS. 59 to 66 are successive continuations of the context content description data of FIG. 58.
FIG. 67 shows part of an XML document in which representative data is added to the context content description data of FIGS. 58 to 66, together with an example of a DTD for representing that context content description data on a computer.
FIGS. 68 to 80 are successive continuations of the context content description data of FIG. 67.
FIG. 81 is a flowchart of the processing of the selection step in the seventh embodiment.
FIG. 82 is a block diagram of an application of the seventh embodiment.
FIG. 83 is a flowchart of the processing of the selection step in the eighth embodiment.
FIG. 84 is a flowchart of the processing of the selection step in the ninth embodiment.
FIG. 85 is a flowchart of the processing of the selection step in the tenth embodiment.
FIG. 86 is a block diagram of the data processing method according to the twelfth embodiment.
FIG. 87 is a diagram showing the data structure of the context content description data in the twelfth embodiment.
FIG. 88 shows an XML DTD for representing the context content description data of the fifth embodiment on a computer, together with part of an example XML document.
FIGS. 89 to 96 are successive continuations of the example of FIG. 88.
FIG. 97 is a block diagram of the data processing method according to the thirteenth embodiment.
FIG. 98 is a block diagram of the data processing method according to the fourteenth embodiment.
FIG. 99 is a block diagram of the data processing method according to the fifteenth embodiment.
FIG. 100 is a block diagram of the data processing method according to the sixteenth embodiment.
FIG. 101 is a block diagram of the data processing method according to the seventeenth embodiment.
FIG. 102 is an explanatory diagram showing channels and layers.
FIG. 103 shows a DTD for describing physical content description data in XML, together with an example of part of physical content description data based on that DTD.
FIG. 104 is a continuation of the physical content description data of FIG. 103.
FIG. 105 is a flowchart showing the processing of the determination step in Example 1 of the seventeenth embodiment.
FIG. 106 is a flowchart showing the determination processing based on a user request performed by the determination step in Example 1 of the seventeenth embodiment.
FIG. 107 is a flowchart showing the determination processing for video information performed by the determination step in Example 1 of the seventeenth embodiment.
FIG. 108 is a flowchart showing the determination processing for sound information performed by the determination step in Example 1 of the seventeenth embodiment.
FIGS. 109 and 110 are each part of a flowchart showing the processing of the determination step in Example 2 of the seventeenth embodiment.
FIG. 111 is a flowchart showing the processing of the determination step in Example 3 of the seventeenth embodiment.
FIG. 112 is a flowchart showing the determination processing for video information performed by the determination step in Example 3 of the seventeenth embodiment.
FIG. 113 is a flowchart showing the determination processing for sound information performed by the determination step in Example 3 of the seventeenth embodiment.
FIGS. 114 and 115 are each part of a flowchart showing the processing of the determination step in Example 4 of the seventeenth embodiment.
FIG. 116 is a flowchart showing the determination processing based on a user request performed by the determination step in Example 4 of the seventeenth embodiment.
FIG. 117 is a flowchart showing the determination processing for video information performed by the determination step in Example 4 of the seventeenth embodiment.
FIG. 118 is a flowchart showing the determination processing for sound information performed by the determination step in Example 4 of the seventeenth embodiment.
FIGS. 119 and 120 are each part of a flowchart showing the processing of the determination step in Example 5 of the seventeenth embodiment.
FIG. 121 is a flowchart showing the determination processing based on a user request performed by the determination step in Example 5 of the seventeenth embodiment.
FIG. 122 is a block diagram of the data processing method according to the eighteenth embodiment.
FIG. 123 is a block diagram of the data processing method according to the nineteenth embodiment.
FIG. 124 is a block diagram of the data processing method according to the twentieth embodiment.
FIG. 125 is a block diagram of the data processing method according to the twenty-first embodiment.
FIG. 126 is a block diagram of the data processing method according to the twenty-second embodiment.
FIG. 127 shows a DTD that combines the context content description data and the physical content description data into one, together with an example XML document.
FIGS. 128 to 132 are successive continuations of the XML document of FIG. 127.
FIG. 133 is a diagram showing the data structure of the context content description data in the eleventh embodiment.
FIG. 134 is a diagram showing viewpoints in the eleventh embodiment.
FIG. 135 is a diagram showing importance in the eleventh embodiment.
FIG. 136 shows an example of an XML DTD for representing the context content description data of the eleventh embodiment on a computer, together with part of context content description data written in XML.
FIGS. 137 to 163 are successive continuations of the context content description data of FIG. 136.
FIG. 164 shows another example of an XML DTD for representing the context content description data of the eleventh embodiment on a computer, together with part of context content description data written in XML.
FIGS. 165 to 196 are successive continuations of the context content description data of FIG. 164.
FIG. 197 is a diagram showing the data structure of context content description data of another aspect of the eleventh embodiment.
FIG. 198 shows an example, corresponding to FIG. 197, of an XML DTD for representing the context content description data of the eleventh embodiment on a computer, together with part of context content description data written in XML.
FIGS. 199 to 222 are successive continuations of the context content description data of FIG. 198.
FIG. 223 shows another example, corresponding to FIG. 197, of an XML DTD for representing the context content description data of the eleventh embodiment on a computer, together with part of context content description data written in XML.
FIGS. 224 to 252 are successive continuations of the context content description data of FIG. 223.

Explanation of reference numerals

101 Selection step
102 Extraction step
501, 513 Selection step
503 Extraction step
505 Playback step
507 Video selection step
509 Sound selection step
511 Determination step
515 Construction step
517 Delivery step
519 Recording step
523 Data recording medium management step
525 Stored content management step
601 Separation means
602 Video skimming means
603 Audio skimming means
1301 Node <section>
1302 Node <section>
1301 Node <section>
1301 Leaf <segment>
1801 Selection step
1802 Extraction step
1803 Construction step
1804 Delivery step
1805 Database
2401 Interface means
2402 Separation means
2403 Video skimming means
2404 Audio skimming means
3101 Selection step
3102 Extraction step
3103 Construction step
3104 Delivery step
3105 Database
3201 Stream selection means
3202 Multiplexing means
3203 Situation determination means
3204 Delivery means
4101 Selection step
4102 Extraction step
4103 Construction step
4104 Delivery step
4105 Database
4401 Selection step
4402 Extraction step
4403 Construction step
4404 Delivery step
4405 Database

Claims

1. A data processing device comprising: input means for inputting context content description data that describes segments, each representing a scene of media content composed of a plurality of scenes, and a score that is attribute information of each segment and represents importance based on the context content of the media content; and
selection means for selecting a segment based on the score.

2. The data processing device according to claim 1, wherein the scenes of the media content are delimited in time at scene breaks, and the context content description data describes, as the attribute information, time information indicating the scene breaks.

3. The data processing device according to claim 2, wherein the time information includes a start time and an end time of each scene.

4. The data processing device according to claim 2, wherein the time information includes a start time and a duration of each scene.

5. The data processing device according to claim 1, wherein the plurality of segments are described hierarchically in the context description data.

6. The data processing device according to any one of claims 1 to 5, wherein the context content description data has auxiliary information relating to the context content.

7. The data processing device according to claim 1, wherein the media content is at least one of video information and sound information.

8. The data processing device according to claim 1, wherein a link destination of representative data representing the segment is added to the segment.

9. The data processing device according to claim 8, wherein the representative data is at least one of video information and sound information.

10. A data processing method comprising: inputting context content description data that describes segments, each representing a scene of media content composed of a plurality of scenes, and a score that is attribute information of each segment and represents importance based on the context content of the media content; and
selecting a segment based on the score.

11. The data processing method according to claim 10, wherein the scenes of the media content are delimited in time at scene breaks, and the context content description data describes, as the attribute information, time information indicating the scene breaks.

12. The data processing method according to claim 11, wherein the time information includes a start time and an end time of each scene.

13. The data processing method according to claim 11, wherein the time information includes a start time and a duration of each scene.

14. The data processing method according to claim 10, wherein a plurality of the segments are described hierarchically in the context description data.

15. The data processing method according to any one of claims 10 to 14, wherein the context content description data has auxiliary information relating to the context content.

16. The data processing method according to claim 10, wherein the media content is at least one of video information and sound information.

17. The data processing method according to claim 10, wherein a link destination of representative data representing the segment is added to the segment.

18. The data processing method according to claim 17, wherein the representative data is at least one of video information and sound information.

19. A program for causing a computer to execute: a step of inputting context content description data that describes segments, each representing a scene of media content composed of a plurality of scenes, and a score that is attribute information of each segment and represents importance based on the context content of the media content; and a step of selecting a segment based on the score.

20. The program according to claim 19, wherein the scenes of the media content are delimited in time at scene breaks, and the context content description data describes, as the attribute information, time information indicating the scene breaks.

21. A computer-readable recording medium on which the program according to claim 19 or 20 is recorded.