JP4214103B2 - Data processing apparatus and data processing method

Info

Publication number: JP4214103B2
Application number: JP2004291993A
Authority: JP
Inventors: 敏彦宗續; 恒一江村
Original assignee: Panasonic Corp; Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Corp; Panasonic Holdings Corp
Priority date: 2000-06-14
Filing date: 2004-10-04
Publication date: 2009-01-28
Anticipated expiration: 2021-05-28
Also published as: JP2005122730A

Description

The present invention relates to a data processing apparatus and a data processing method that, in the viewing, playback, and delivery of media content (continuous audiovisual information such as moving images, video, and audio), convert a description of the structure of the media content into a description of the expression used for playback, so that playback and delivery can be adapted to viewer preferences and terminal capabilities.

Conventionally, media content is stored in units of files, and it is played back and delivered in units of the files that store it.

When media content is digitized by several different methods and stored in several files, decoding is required to play the media content back, and the decoding load depends on the digitization method. When choosing media content, it is therefore necessary to choose content digitized by a method that matches the processing capability of the playback terminal. In this case, the receiving user selects, file by file, the media content that matches the capability of the terminal being used.

As a method of playing back only specific scenes in moving image distribution over the World Wide Web, the method described in Patent Document 1 is known. FIG. 50 shows the configuration of the moving image distribution apparatus described in Patent Document 1, which is explained below.

In this moving image distribution apparatus, the scene number, the time codes of the start and end frames, keywords related to the scene, and the moving image file name are entered in advance from the scene information input unit 3903 into the scene information storage unit 3904. The scene search unit 3905 then searches the scene information stored in the scene information storage unit 3904 using search conditions entered from the scene information input unit 3903. The scene search unit 3905 extracts the scene numbers of the desired scenes it finds and stores them as a scenario in the scenario storage unit 3907.

Next, the scenario editing unit 3908 reorders the extracted scenes and deletes unnecessary scenes as needed. The moving image transfer unit 3909 then transfers the moving image data stored in the moving image file storage unit 3902 to the moving image playback unit 3910, in the order of the scene numbers stored in the scenario edited by the scenario editing unit 3908, for playback. Moving images are input to the moving image file storage unit 3902 from the moving image file input unit 3901.
Patent Document 1: Japanese Unexamined Patent Application Publication No. H10-111872 (JP H10-111872 A)

However, with the conventional method of playing back content in units of files, all of the content stored in a file has to be played back. It is therefore impossible to view a synopsis, that is, a summary of the content. In addition, even when looking for highlight scenes extracted from part of the content, or for a scene the viewer wants to see, the content has to be examined from the beginning.

With the method described in Patent Document 1, the playback order of scene cuts can be specified, so the content no longer has to be examined from the beginning. However, in this method the scenario holds only the playback order of the scenes, so nothing can be done other than rearranging that order. Complex playback, such as playing back multiple media in association with one another, is therefore not possible.

An object of the present invention is to generate, from structure description data expressing the structure of media content, expression description data for playing back the media segments described in that structure description data under various constraints.

To solve the above problem, the present invention generates, from structure description data describing the structure of media content, expression description data that expresses the playback order, playback timing, and synchronization information of the media segments described in the structure description data.

By selecting several media segments from the structure description data in this way and converting them into expression description data that expresses their playback order, playback timing, and synchronization information, presentation forms such as a synopsis, highlight scenes, or a collection of scenes matching the viewer's preferences can be obtained. In addition, because the expression description data carries the playback order, playback timing, and synchronization information, multiple media can be played back in association with one another.

Furthermore, in the present invention, a set of alternative data for the media segments is stored in the structure description data, and the structure description data is converted into expression description data that expresses the playback order, playback timing, and synchronization information of at least one of the media segments and the alternative data.

This makes it possible to switch between playing back a media segment and playing back its alternative data according to the capacity and traffic of the network that delivers the media content, the capability of the terminal that plays it back, and so on. In other words, content can be delivered and played back using media suited to conditions such as the capability of the playback terminal.

Furthermore, the present invention provides a media selection unit that selects whether the media segment or the alternative data is played back when a media segment expressed in the structure description data is played back.

As a result, the media selection unit can automatically select the media segment or the alternative data according to the capability of the terminal, without the viewer having to make that choice.
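The automatic choice described above can be illustrated with a minimal Python sketch. It assumes that each medium carries a bitrate and that the terminal capability is reduced to a single maximum bitrate; the names `Media`, `Segment`, `select_media`, and `max_kbps` are hypothetical and do not appear in the specification.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Media:
    uri: str
    bitrate_kbps: int  # assumed cost measure; the specification names no metric


@dataclass
class Segment:
    start_s: float
    end_s: float
    primary: Media  # the media segment itself
    alternatives: List[Media] = field(default_factory=list)  # alternative data


def select_media(segment: Segment, max_kbps: int) -> Optional[Media]:
    """Prefer the media segment; otherwise pick the richest alternative that fits."""
    if segment.primary.bitrate_kbps <= max_kbps:
        return segment.primary
    fitting = [m for m in segment.alternatives if m.bitrate_kbps <= max_kbps]
    # Returns None when neither the segment nor any alternative fits.
    return max(fitting, key=lambda m: m.bitrate_kbps, default=None)
```

A real selector would probably weigh several conditions at once (network traffic, decoder support, display size); a single threshold is used here only to keep the idea visible.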

Furthermore, in the present invention, a score based on the contextual content of each media segment is additionally described in the structure description data.

This makes it possible, for example, to generate highlight scene collections of various playback times and to play back and deliver them easily. In addition, by basing the score on a viewpoint indicated by a keyword, specifying a keyword allows only the scenes matching the user's preferences to be played back and delivered.
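One way highlight collections of different lengths might be produced from the scores is sketched below in Python. The greedy strategy, the tuple layout `(start_s, end_s, score)`, and the name `pick_highlights` are assumptions made for illustration; the text above does not prescribe a particular selection algorithm.

```python
from typing import List, Tuple

# (start_s, end_s, score): hypothetical layout for one scored media segment
ScoredSegment = Tuple[float, float, float]


def pick_highlights(segments: List[ScoredSegment], budget_s: float) -> List[ScoredSegment]:
    """Greedily keep the highest-scoring segments that fit in the time budget."""
    chosen = []
    remaining = budget_s
    for seg in sorted(segments, key=lambda s: s[2], reverse=True):
        duration = seg[1] - seg[0]
        if duration <= remaining:
            chosen.append(seg)
            remaining -= duration
    # Restore chronological order so the highlights play back in story order.
    return sorted(chosen, key=lambda s: s[0])
```

Calling the function with different values of `budget_s` on the same scored segments yields collections of different playback times.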

According to the present invention, structure description data that describes the structure of media content in terms of media segments can be converted into expression description data that expresses the form in which the media content is played back. Constraints such as playback timing and synchronization information can therefore be applied to each media segment when the media content is played back, enabling a wide variety of playback.

A data processing apparatus according to a first aspect of the present invention comprises: an analysis unit that receives structure description data expressing the structure of all or part of media content, which is continuous audiovisual information, as a set of time information of media segments, which are the sections into which the media content is divided, and that acquires the time information of the media segments described in the received structure description data; and a conversion unit that uses the acquired time information of the media segments to convert the structure description data into expression description data expressing the playback order, playback timing, and synchronization information of the media segments, and outputs the expression description data.

With this configuration, information on the playback of media content can be generated from information on its structure.

According to a second aspect of the present invention, in the data processing apparatus according to the first aspect, the structure description data includes a set of alternative data for the media segments, and the conversion unit converts the structure description data into expression description data expressing the playback order, playback timing, and synchronization information of at least one of the media segments and the alternative data.

As a result, playback information that includes the selection of the display media can be generated from information on the structure of the media content.

According to a third aspect of the present invention, the data processing apparatus according to the second aspect comprises a media selection unit that selects whether the media segment or the alternative data is played back when a media segment expressed in the structure description data is played back, and the conversion unit outputs, on the basis of the selection made by the media selection unit, expression description data expressing the playback order, playback timing, and synchronization information of either the media segment or the alternative data.

According to a fourth aspect of the present invention, in the data processing apparatus according to any one of the first to third aspects, the expression description data is a SMIL document.

This allows the expression description data to conform to an international standard, which gives it general applicability.
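To make the SMIL case concrete, the following Python sketch turns the time information of selected media segments into a small SMIL document in which a `seq` element fixes the playback order and clip attributes fix the timing. The function name `to_smil` and the input layout are hypothetical. The attribute names `clipBegin` and `clipEnd` follow SMIL 2.0; SMIL 1.0 spelled them `clip-begin` and `clip-end`, so a converter targeting an older player would have to adjust them.

```python
import xml.etree.ElementTree as ET


def to_smil(segments):
    """Build a SMIL string from (src, start_s, end_s) tuples given in playback order."""
    smil = ET.Element("smil")
    body = ET.SubElement(smil, "body")
    seq = ET.SubElement(body, "seq")  # children of <seq> play one after another
    for src, start_s, end_s in segments:
        ET.SubElement(
            seq,
            "video",
            {"src": src, "clipBegin": f"{start_s}s", "clipEnd": f"{end_s}s"},
        )
    return ET.tostring(smil, encoding="unicode")


if __name__ == "__main__":
    # Two excerpts of a hypothetical file, played back to back.
    print(to_smil([("match.mpg", 30, 45), ("match.mpg", 100, 110)]))
```

Synchronized playback of several media would use a `par` element in place of, or nested inside, the `seq` element; that is omitted here for brevity.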

A data processing apparatus according to a fifth aspect of the present invention comprises: a selection unit that receives structure description data, which expresses the structure of media content (continuous audiovisual information in which video information and audio information are synchronized) as a set of media segments, which are the sections into which the media content is divided, and which describes the time information of each media segment and its score in terms of the contextual content, together with a selection condition for selecting given media segments from the structure description data, and that selects from the received structure description data only the media segments whose scores satisfy the selection condition; and a conversion unit that converts the media segments selected by the selection unit into expression description data expressing the playback order and playback timing of the media segments, and outputs the expression description data.

With this configuration, only the media segments that satisfy the condition can be picked out from information on the structure of the media content, and information on the playback of only those media segments can be generated.

According to a sixth aspect of the present invention, in the data processing apparatus according to the fifth aspect, the structure description data includes a set of alternative data for the media segments, and the conversion unit converts the structure description data into expression description data expressing the playback order, playback timing, and synchronization information of at least one of the media segments and the alternative data.

As a result, only the media segments that satisfy the condition can be picked out from information on the structure of the media content, and playback information that includes the selection of the display media for the selected media segments can be generated.

According to a seventh aspect of the present invention, in the data processing apparatus according to the sixth aspect, the selection unit selects whether the media segment or the alternative data is played back when a selected media segment is played back.

As a result, only the media segments that make up a synopsis or highlight scenes can be picked out from information on the structure of the media content, and information on the playback of the selected media segments can be generated.

According to an eighth aspect of the present invention, in the data processing apparatus according to any one of the fifth to seventh aspects, the score is the importance of the relevant media segment based on the contextual content of the media content.

As a result, only the media segments that make up a synopsis or highlight scenes reflecting the viewer's preferences can be picked out from information on the structure of the media content, and information on the playback of only those media segments can be generated.

According to a ninth aspect of the present invention, in the data processing apparatus according to any one of the fifth to eighth aspects, a viewpoint expressed by a keyword is assigned to each media segment, and the score is the importance of the relevant media segment based on that viewpoint.
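Selection by a keyword viewpoint can be sketched as follows in Python. The sketch assumes that each media segment stores one importance value per viewpoint keyword and that the selection condition is a keyword plus a minimum score; the names `Segment`, `scores`, and `select_by_viewpoint`, and the threshold form of the condition, are illustrative assumptions only.

```python
from typing import Dict, List, NamedTuple


class Segment(NamedTuple):
    start_s: float
    end_s: float
    scores: Dict[str, float]  # viewpoint keyword -> importance (assumed layout)


def select_by_viewpoint(segments: List[Segment], keyword: str, min_score: float) -> List[Segment]:
    """Keep segments whose score for the given viewpoint reaches the threshold."""
    # A segment with no score for the keyword is treated as having importance 0.
    return [s for s in segments if s.scores.get(keyword, 0.0) >= min_score]
```

Because the input order is preserved, the result can be handed directly to a converter that emits the segments in playback order.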

A data processing apparatus according to a tenth aspect of the present invention comprises: a selection unit that receives structure description data, which expresses the structure of media content (continuous audiovisual information consisting of at least one of video information and audio information) as a set of media segments, which are the sections into which the media content is divided, and which describes the time information of each media segment and its score in terms of the contextual content, together with a selection condition for selecting given media segments from the structure description data, and that selects from the received structure description data only the media segments whose scores satisfy the selection condition; and a conversion unit that converts the media segments selected by the selection unit into expression description data expressing the playback order, playback timing, and synchronization information of the media segments, and outputs the expression description data.

A data processing apparatus according to an eleventh aspect of the present invention comprises: a selection unit that receives structure description data, which expresses the structure of media content (continuous audiovisual information) as a set of media segments, which are the sections into which the media content is divided, and which describes the time information of each media segment and its score in terms of the contextual content, together with a selection condition for selecting given media segments from the structure description data, and that selects from the received structure description data only the media segments whose scores satisfy the selection condition; a conversion unit that converts the media segments selected by the selection unit into expression description data expressing the playback order, playback timing, and synchronization information of the media segments, and outputs the expression description data; and a playback unit that receives the expression description data and the media content and plays back the media content in accordance with the expression description data.

With this configuration, only the media segments that satisfy the condition can be picked out from information on the structure of the media content, and only those media segments are played back.

A server-client system according to a twelfth aspect of the present invention comprises: a server having the selection unit and the conversion unit of the eleventh aspect; a client having the playback unit of the eleventh aspect; and a network connecting the server and the client, wherein the expression description data is communicated between the server and the client.

As a result, the server performs summarization and selection of the playback media according to the user's request, the state of the network, the capability of the playback terminal, and so on; the client plays back only the necessary media segments; and only the necessary data needs to be received when the playback media are communicated.

A server-client system according to a thirteenth aspect of the present invention comprises: a server having the selection unit of the eleventh aspect; a client having the conversion unit and the playback unit of the eleventh aspect; and a network connecting the server and the client, wherein summary structure description data, in which only the media segments selected by the selection unit are retained, is communicated between the server and the client.

As a result, the server performs summarization according to the user's request, the state of the network, the capability of the playback terminal, and so on; the client selects playback media appropriate to the various conditions and plays back only the necessary media segments; and only the necessary data needs to be received when the playback media are communicated.

A fourteenth aspect of the present invention is a data processing method comprising: receiving structure description data expressing the structure of all or part of media content, which is continuous audiovisual information, as a set of time information of media segments, which are the sections into which the media content is divided; acquiring the time information of the media segments described in the received structure description data; and, using the acquired time information of the media segments, converting the structure description data into expression description data expressing the playback order, playback timing, and synchronization information of the media segments, and outputting the expression description data.

A fifteenth aspect of the present invention is a data processing method comprising: receiving structure description data, which expresses the structure of media content (continuous audiovisual information in which video information and audio information are synchronized) as a set of media segments, which are the sections into which the media content is divided, and which describes the time information of each media segment and its score in terms of the contextual content, together with a selection condition for selecting given media segments from the structure description data; selecting from the received structure description data only the media segments whose scores satisfy the selection condition; and converting the selected media segments into expression description data expressing the playback order and playback timing of the media segments, and outputting the expression description data.

A sixteenth aspect of the present invention is a data processing method comprising: receiving structure description data, which expresses the structure of media content (continuous audiovisual information consisting of at least one of video information and audio information) as a set of media segments, which are the sections into which the media content is divided, and which describes the time information of each media segment and its score in terms of the contextual content, together with a selection condition for selecting given media segments from the structure description data; selecting from the received structure description data only the media segments whose scores satisfy the selection condition; and converting the selected media segments into expression description data expressing the playback order, playback timing, and synchronization information of the media segments, and outputting the expression description data.

A seventeenth aspect of the present invention is a program that causes a computer to: receive structure description data expressing the structure of all or part of media content, which is continuous audiovisual information, as a set of time information of media segments, which are the sections into which the media content is divided; acquire the time information of the media segments described in the received structure description data; and, using the acquired time information of the media segments, convert the structure description data into expression description data expressing the playback order of the media segments, the playback timing of the media segments, and the synchronization information of the media segments, and output the expression description data.

An eighteenth aspect of the present invention is a program that causes a computer to: receive structure description data, which expresses the structure of media content (continuous audiovisual information in which video information and audio information are synchronized) as a set of media segments, which are the sections into which the media content is divided, and which describes the time information of each media segment and its score in terms of the contextual content, together with a selection condition for selecting given media segments from the structure description data; select from the received structure description data only the media segments whose scores satisfy the selection condition; and convert the selected media segments into expression description data expressing the playback order and playback timing of the media segments, and output the expression description data.

A nineteenth aspect of the present invention is a program that causes a computer to: receive structure description data, which expresses the structure of media content (continuous audiovisual information consisting of at least one of video information and audio information) as a set of media segments, which are the sections into which the media content is divided, and which describes the time information of each media segment and its score in terms of the contextual content, together with a selection condition for selecting given media segments from the structure description data; select from the received structure description data only the media segments whose scores satisfy the selection condition; and convert the selected media segments into expression description data expressing the playback order, playback timing, and synchronization information of the media segments, and output the expression description data.

A twentieth aspect of the present invention is a computer-readable recording medium storing the program according to any one of the seventeenth to nineteenth aspects.

(Embodiment 1)
Embodiment 1 of the present invention will be described below with reference to the accompanying drawings. First, the configuration of the data processing system according to Embodiment 1 will be described with reference to FIG. 1. FIG. 1 is a conceptual diagram of the data processing system according to Embodiment 1.

The data processing system according to Embodiment 1 comprises a metadata database 1001, a summary engine 1002, a description converter 1003, a player 1004, and a media content database 1005. In the figure, 1006 denotes a content description, which is metadata; 1007 denotes a selection condition; 1008 denotes a summary content description, which is the result of summarization; 1009 denotes a playback method description that gives instructions to the player 1004; and 1010 denotes media content data.

また、メタデータとは、メディアコンテンツのタイトルや作成日時などの書誌事項や内容やシーン構成などの、メディアコンテンツの付加的な情報を表すデータである。データベース１００１は、このようなメタデータのデータベースを表す。 The metadata is data representing additional information of the media content such as bibliographic items such as the title of the media content, creation date and time, contents, and scene configuration. The database 1001 represents such a metadata database.

The summary engine 1002 takes as input, from the metadata stored in database 1001, the content description 1006, which is structure description data representing the content and structure of the media content. From the input content description 1006, the summary engine 1002 selects only the scenes that match the selection condition 1007 entered by the user. It then generates and outputs the summary content description 1008, which retains only the parts of the content description 1006 that relate to the selected scenes and deletes the rest.

Both the content description 1006 and the summary content description 1008 are structure description data representing the content and structure of the media content. They differ in the number of scenes described but share the same description format.

The description converter 1003 takes the summary content description 1008 as input and generates and outputs the playback method description 1009. This is expression description data describing how the media are to be played back when the scenes described in the summary content description 1008 are played, including the playback order, playback start timing, and synchronization information.

The player 1004 takes as input the playback method description 1009 and, from the media content database 1005, the media content data 1010 to be played back according to that description. The player 1004 then plays back the media content data 1010 according to the playback order, playback start timing, synchronization information, and so on described in the playback method description 1009.

Because the summary content description 1008 and the content description 1006 have the same format, the description converter 1003 can generate a playback method description (expression description data) for the content description 1006 in the same way.

Next, the structure description data used for the content description 1006 and the summary content description 1008 is described with reference to FIG. 2(a), FIG. 2(b), and FIG. 3.

FIG. 2(a) shows a Document Type Definition (DTD), the definition used to write structure description data in XML. FIG. 2(b) shows an example of structure description data for media content in which video and audio are synchronized, using MPEG1 as an example. FIG. 3 shows an example of structure description data for media content in which video and audio are separate media.

In this embodiment, Extensible Markup Language (XML) is used as one example of a form for representing structure description data on a computer.

XML is a data description language standardized by the World Wide Web Consortium (W3C); version 1.0 became a W3C Recommendation on February 10, 1998. The XML 1.0 specification is available at http://www.w3.org/TR/REC-xml.

First, the DTD for writing structure description data in XML is described with reference to FIG. 2(a).

As indicated by 201 in the figure, the contents element consists of par elements and mediaObject elements. As indicated by 202, the contents element has a title attribute given as character data.

The mediaObject element represents a medium. As indicated by 203 in the figure, the par element consists of multiple child mediaObject elements. When the contents element is made up of multiple mediaObject elements such as audio and video, the par element indicates that its child mediaObject elements are to be played back in synchronization.

As indicated by 204 in the figure, the mediaObject element consists of segment elements, each representing a media segment. As indicated by 205, the type attribute of the mediaObject element specifies the media type. In this example the available media types are audio (audio information), video (moving-image information), image (still-image information), audiovideo (synchronized audio and video), and audioimage (audio and still-image information). If the type attribute is not specified, it defaults to audiovideo.

As indicated by 206 in the figure, the format attribute of the mediaObject element specifies the media format, such as MPEG1 or MPEG2. As indicated by 207, the src attribute specifies where the data is stored; the location is given as a Uniform Resource Locator (URL).

As indicated by 208 in the figure, the segment element has a start attribute and an end attribute, which give the start time and end time of the segment. Both are times within the medium specified by the mediaObject element. In other words, the start and end attributes specify which part of the medium designated by the mediaObject element the segment element corresponds to.

In this embodiment the time information of a media segment is specified as a pair of start time and end time, but it may instead be expressed as a pair of start time and duration.
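The equivalence of the two time representations can be illustrated with a minimal Python sketch. It assumes times are written as hh:mm:ss strings, as in the examples of this embodiment; the function names are hypothetical and are not part of the described apparatus.

```python
def to_seconds(hms):
    """Convert an 'hh:mm:ss' string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def to_hms(total_seconds):
    """Convert a number of seconds back to an 'hh:mm:ss' string."""
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def start_end_to_start_duration(start, end):
    """Rewrite a (start, end) pair as a (start, duration) pair."""
    return start, to_hms(to_seconds(end) - to_seconds(start))


def start_duration_to_start_end(start, duration):
    """Rewrite a (start, duration) pair as a (start, end) pair."""
    return start, to_hms(to_seconds(start) + to_seconds(duration))
```

Either pair identifies the same interval within the medium, so a converter could accept both forms and normalize to one of them before further processing.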

Next, an example of structure description data for media content in which video and audio are synchronized is described with reference to FIG. 2(b), using MPEG1 as an example.

In the structure description data shown in FIG. 2(b), the contents element has the title Movie etc. The mediaObject element has the type audiovideo, the format MPEG1, and the storage location http://mserv.com/MPEG/movie0.mpg. It contains four segment elements, with time information from 00:00:00 to 00:01:00, from 00:01:00 to 00:02:00, from 00:03:00 to 00:04:00, and from 00:04:00 to 00:05:00. In other words, the mediaObject element describes the content with the interval from 00:02:00 to 00:03:00 omitted.
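As an illustration, the structure description data described above can be written out and read back with Python's standard xml.etree.ElementTree module. The XML text below is reconstructed from the prose description only, since FIG. 2(b) itself is not reproduced here, so its exact layout is an assumption; read_media_objects is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Reconstruction of the FIG. 2(b) example from its prose description (assumed layout).
STRUCTURE_XML = """\
<contents title="Movie etc">
  <mediaObject type="audiovideo" format="MPEG1"
               src="http://mserv.com/MPEG/movie0.mpg">
    <segment start="00:00:00" end="00:01:00"/>
    <segment start="00:01:00" end="00:02:00"/>
    <segment start="00:03:00" end="00:04:00"/>
    <segment start="00:04:00" end="00:05:00"/>
  </mediaObject>
</contents>
"""


def read_media_objects(xml_text):
    """Return one dict per mediaObject element, whether or not it sits inside a par."""
    root = ET.fromstring(xml_text)
    media_objects = []
    for media in root.iter("mediaObject"):
        media_objects.append({
            # ElementTree does not apply DTD defaults, so the default is set here.
            "type": media.get("type", "audiovideo"),
            "format": media.get("format"),
            "src": media.get("src"),
            "segments": [(seg.get("start"), seg.get("end"))
                         for seg in media.findall("segment")],
        })
    return media_objects
```

Calling read_media_objects(STRUCTURE_XML) yields a single entry with four (start, end) pairs, which is the information the description converter needs for the conversion described later.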

Next, an example of structure description data for media content in which video and audio are separate media is described with reference to FIG. 3.

In the structure description data shown in FIG. 3, the contents element has the title Movie etc. In this example the contents element consists of a mediaObject element of type video and a mediaObject element of type audio, and a par element therefore synchronizes the two.

The mediaObject element of type video has the format MPEG1 and the storage location http://mserv.com/MPEG/movie0v.mpv.

The mediaObject element of type video contains four segment elements, with time information from 00:00:00 to 00:01:00, from 00:01:00 to 00:02:00, from 00:03:00 to 00:04:00, and from 00:04:00 to 00:05:00. In other words, it describes the video with the interval from 00:02:00 to 00:03:00 omitted.

The mediaObject element of type audio has the format MPEG1 and the storage location http://mserv.com/MPEG/movie0a.mp2. It likewise contains four segment elements, with time information from 00:00:00 to 00:01:00, from 00:01:00 to 00:02:00, from 00:03:00 to 00:04:00, and from 00:04:00 to 00:05:00. In other words, it describes the audio with the interval from 00:02:00 to 00:03:00 omitted.

When content is composed of multiple media, the playback timing and synchronization between media segments must be controlled. In this embodiment, therefore, the description converter 1003 converts the summary content description 1008, written as structure description data, into the playback method description 1009, written as expression description data that can express the playback order, playback timing, and synchronization information of the media segments.

In this embodiment, Synchronized Multimedia Integration Language (SMIL) is used for the expression description data. SMIL is a description language standardized by the W3C for describing, for multiple media, the temporal behavior of their presentation and their layout on the display screen. SMIL 1.0 became a W3C Recommendation on June 15, 1998. The SMIL 1.0 specification is available at http://www.w3.org/TR/REC-smil.

Using the standardized SMIL for the expression description data makes it possible to use existing SMIL processing programs as well as those developed in the future, which increases versatility.

Next, the process of converting structure description data written in XML into expression description data that expresses the playback form, such as the playback order, playback timing, and synchronization information of the media segments, is described with reference to FIG. 4. FIG. 4 is a flowchart showing the procedure by which the description converter according to Embodiment 1 converts structure description data into SMIL.

When processing starts (step S401), the description converter 1003 checks in step S402 whether the summary content description 1008, written as structure description data, contains a par element. If it determines in step S402 that there is a par element, it proceeds to step S406; if not, it proceeds to step S403.

In step S403, the description converter 1003 obtains, from the mediaObject element of the summary content description 1008, the media type from the type attribute, the media format from the format attribute, and the URL of the media data from the src attribute. Next, in step S404, it obtains the time information of each media segment from the start and end attributes of each segment element and stores it. Then, in step S405, it generates and outputs the playback method description 1009 as a SMIL document from the media format, media data URL, and media segment time information obtained in steps S403 and S404.

If there is a par element, on the other hand, the description converter 1003 obtains the first mediaObject element in the par element in step S406. Next, in step S407, it obtains from that mediaObject element the media type from the type attribute, the media format from the format attribute, and the URL of the media data from the src attribute. Then, in step S408, it obtains the time information of each media segment from the start and end attributes of each segment element and stores it.

Next, in step S409, the description converter 1003 checks whether the par element contains any mediaObject element that has not yet been examined. If so, it obtains the first such element in step S410 and returns to step S407. If not, it proceeds to step S411.

In step S411, the description converter 1003 uses the stored time information of the segment elements to group segment elements that belong to different mediaObject elements and overlap in time. Then, in step S412, it generates and outputs the playback method description 1009 as a SMIL document from the media format, media data URL, and media segment time information obtained in steps S407 and S408.
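The grouping in step S411 can be sketched in Python as follows. The description does not specify a grouping algorithm, so the single sweep over segments sorted by start time is an assumption, as are the function name group_overlapping and the use of seconds for times.

```python
def group_overlapping(media_segments):
    """Group segments from different media objects that overlap in time.

    media_segments maps a media object name to a list of (start, end)
    pairs in seconds. Returns a list of groups ordered by time, each group
    being a list of (name, start, end) tuples.
    """
    # Flatten all segments and sort them by start time.
    flat = sorted(
        (start, end, name)
        for name, segments in media_segments.items()
        for start, end in segments
    )
    groups = []
    group_end = None
    for start, end, name in flat:
        if group_end is not None and start < group_end:
            # Starts before the current group ends: it overlaps, so join the group.
            groups[-1].append((name, start, end))
            group_end = max(group_end, end)
        else:
            # No overlap (segments that merely touch do not count): start a new group.
            groups.append([(name, start, end)])
            group_end = end
    return groups
```

For the FIG. 3 example, where the video and audio objects each have segments covering 0 to 60, 60 to 120, 180 to 240, and 240 to 300 seconds, this produces four groups, each pairing one video segment with the audio segment of the same interval.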

Next, the processing in step S405, in which the description converter 1003 outputs the playback method description 1009 (a SMIL document) from the summary content description 1008 (structure description data) when the latter contains no par element, is described with reference to FIG. 5. FIG. 5 is a flowchart of this processing by the description converter according to Embodiment 1.

First, the description converter 1003 outputs the SMIL header (step S501).

As shown in FIG. 6, a SMIL document consists of a header 601 and a body 602. The header 601 is written as a head element and the body 602 as a body element. That is, the header 601 is the part enclosed by <head> and </head>, and the body 602 is the part enclosed by <body> and </body>.

The header can contain information such as the author and creation date, as well as layout information such as where on the screen video and text are displayed. The header may be omitted.

Next, the description converter 1003 encloses all the media segments in <seq> and </seq> (step S502). These tags form a seq element, which indicates that the media segments enclosed by <seq> and </seq> are processed, that is, played back or displayed, in the order in which they are written.

The description converter 1003 then performs the following processing for each media segment enclosed by <seq> and </seq>.

First, it selects, from the SMIL audio, video, ref, and img elements, the element corresponding to the media type (step S503). The ref element is defined as a generic form that does not specify the kind of source medium: what it refers to may be audio, video, a still image, or synchronized video and audio.

Next, the description converter 1003 sets the clip-begin and clip-end attributes of the element selected in step S503 as follows: the value of the start attribute of the corresponding segment element in the summary content description 1008 is used as the value of the SMIL clip-begin attribute, and the value of the end attribute as the value of the clip-end attribute (step S504). Here, a clip means a temporal interval.

Next, the description converter 1003 sets the src attribute of the element selected in step S503 to the value of the src attribute of the mediaObject element that is the parent of the corresponding segment element in the summary content description 1008 (step S505). It then outputs the description of the element selected in step S503.

In this way, the description converter 1003 generates the playback method description 1009, which is expression description data written in SMIL, from the summary content description 1008, which is structure description data.
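Steps S502 to S505 can be sketched in Python as below. This is an illustration under stated assumptions rather than the converter itself: the mapping from media type to SMIL element (in particular, audiovideo and audioimage to ref) is assumed, the header is omitted as the text permits, and the start and end values are copied through unchanged as the text describes, although a given SMIL player may require a specific clip value syntax. The name to_smil_seq is hypothetical.

```python
from xml.sax.saxutils import quoteattr

# Assumed mapping from the media type of a mediaObject to a SMIL element name.
SMIL_ELEMENT_FOR_TYPE = {
    "audio": "audio",
    "video": "video",
    "image": "img",
    "audiovideo": "ref",
    "audioimage": "ref",
}


def to_smil_seq(media_type, src, segments):
    """Build a SMIL document that plays the given segments one after another.

    segments is a list of (start, end) strings taken from segment elements.
    """
    tag = SMIL_ELEMENT_FOR_TYPE.get(media_type, "ref")      # step S503
    lines = ["<smil>", "  <body>", "    <seq>"]             # step S502
    for start, end in segments:
        lines.append(
            f"      <{tag} src={quoteattr(src)} "           # step S505
            f"clip-begin={quoteattr(start)} "               # step S504
            f"clip-end={quoteattr(end)}/>"
        )
    lines += ["    </seq>", "  </body>", "</smil>"]
    return "\n".join(lines)
```

Passing the type, URL, and four segments of the FIG. 2(b) example produces one seq element containing four clip elements in the order in which the segments were described.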

FIG. 7 shows an example of the SMIL document that the description converter 1003 according to Embodiment 1 outputs from the structure description data shown in FIG. 2(b).

In the example document shown in FIG. 7, the following portions of http://mserv.com/MPEG/movie0.mpg are processed in order: 00:00:00 to 00:01:00, 00:01:00 to 00:02:00, 00:03:00 to 00:04:00, and 00:04:00 to 00:05:00. The header is omitted in the example shown in FIG. 7.

A step that merges temporally contiguous clips into one may also be added, in which case the SMIL document shown in FIG. 8 is output.

In the example document shown in FIG. 8, the portion of http://mserv.com/MPEG/movie0.mpg from 00:00:00 to 00:02:00 and then the portion from 00:03:00 to 00:05:00 are processed in that order. The document shown in FIG. 8 therefore results in the same playback as the example document shown in FIG. 7.
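The merging of temporally contiguous clips can be sketched as follows. This minimal Python example assumes the clips belong to the same medium, are already in playback order, and that two clips are contiguous when the end time of one equals the start time of the next; merge_contiguous is a hypothetical name.

```python
def merge_contiguous(segments):
    """Merge consecutive (start, end) clips whose end and start times coincide."""
    merged = []
    for start, end in segments:
        if merged and merged[-1][1] == start:
            # The previous clip ends exactly where this one starts: extend it.
            merged[-1] = (merged[-1][0], end)
        else:
            merged.append((start, end))
    return merged


clips = [("00:00:00", "00:01:00"), ("00:01:00", "00:02:00"),
         ("00:03:00", "00:04:00"), ("00:04:00", "00:05:00")]
print(merge_contiguous(clips))
# [('00:00:00', '00:02:00'), ('00:03:00', '00:05:00')]
```

The four clips of the FIG. 7 example collapse into the two intervals described for FIG. 8, and the gap from 00:02:00 to 00:03:00 is preserved.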

Next, the processing in step S412, in which the description converter 1003 outputs the playback method description 1009 (a SMIL document) from the summary content description 1008 (structure description data) when the latter contains a par element, is described with reference to FIG. 9. FIG. 9 is a flowchart of this processing by the description converter according to Embodiment 1.

First, the description converter 1003 outputs the SMIL header (step S901). Next, it encloses all the media segments in <seq> and </seq> (step S902). It then encloses each group of media segments, in order from the earliest group, in SMIL <par> and </par> tags (step S903).

Next, the description converter 1003 determines whether there are other media segments belonging to the same mediaObject element (step S904); if there are, it encloses them in <seq> and </seq> (step S905). It then performs the following processing for each media segment enclosed by <seq> and </seq>.

First, it selects, from the SMIL audio, video, ref, img, and other elements, the element corresponding to the media type. Next, it sets the clip-begin and clip-end attributes of the selected element (step S906): the value of the start attribute of the corresponding segment element in the summary content description 1008 is used as the value of the SMIL clip-begin attribute, and the value of the end attribute as the value of the clip-end attribute (step S907). Next, it sets the src attribute of the selected element to the value of the src attribute of the mediaObject element that is the parent of the corresponding segment element in the summary content description 1008 (step S908). It then outputs the description of the selected element.

If, on the other hand, there is no other media segment belonging to the same mediaObject element, the description converter 1003 does not enclose the segment in <seq> and </seq> and performs the same per-segment processing as described above.

In this way, even when the summary content description 1008 (structure description data) is composed of multiple media, the description converter 1003 generates the playback method description 1009, which is expression description data in which the multiple media are processed in synchronization.
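The output procedure for the case with a par element can be sketched in Python as follows. The input layout (groups of (name, start, end) tuples, plus a table giving the SMIL element name and URL for each media object) and the names to_smil_par and clip_line are assumptions made for illustration; the header is omitted.

```python
from xml.sax.saxutils import quoteattr


def clip_line(tag, src, start, end, indent):
    """Write one media element with its src, clip-begin, and clip-end attributes."""
    return (f"{indent}<{tag} src={quoteattr(src)} "
            f"clip-begin={quoteattr(start)} clip-end={quoteattr(end)}/>")


def to_smil_par(groups, media):
    """Build a SMIL document from time-ordered groups of overlapping segments.

    groups: list of groups, each a list of (name, start, end) tuples.
    media:  dict mapping a media object name to (smil_tag, src_url).
    """
    lines = ["<smil>", "  <body>", "    <seq>"]          # step S902
    for group in groups:
        lines.append("      <par>")                       # step S903
        # Collect the segments of this group per media object.
        by_media = {}
        for name, start, end in group:
            by_media.setdefault(name, []).append((start, end))
        for name, segments in by_media.items():
            tag, src = media[name]
            if len(segments) > 1:                         # steps S904 and S905
                lines.append("        <seq>")
                lines.extend(clip_line(tag, src, s, e, "          ")
                             for s, e in segments)
                lines.append("        </seq>")
            else:
                s, e = segments[0]
                lines.append(clip_line(tag, src, s, e, "        "))
        lines.append("      </par>")
    lines += ["    </seq>", "  </body>", "</smil>"]
    return "\n".join(lines)
```

Each par element holds the clips that are to be played together, and the enclosing seq element plays the groups one after another, which matches the nesting described above.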

FIG. 10 shows an example of the SMIL document that the description converter according to Embodiment 1 outputs from the structure description data shown in FIG. 3.

In the example document shown in FIG. 10, each portion of the video http://mserv.com/MPEG/movie0v.mpv is synchronized with the portion of the audio http://mserv.com/MPEG/movie0a.mp2 covering the same interval: 00:00:00 to 00:01:00, 00:01:00 to 00:02:00, 00:03:00 to 00:04:00, and 00:04:00 to 00:05:00. The synchronized pairs are processed in the order in which they are written.

As shown in FIG. 11, a SMIL document may also be output after adding a step that merges temporally contiguous clips into one.

To synchronize multiple clips within a par element of a SMIL document, it may be necessary to start playing one clip at a different time from the others. For example, suppose audio and video are separate media objects, the video clip covers the range in which a person appears on screen, and the audio clip contains only the part in which that person is speaking. In this case the audio must start playing at the point where the person begins to speak, so that it matches the movement of the person's mouth in the video.

That is, the playback start time of each clip must be calculated, and playback of the clip must begin when that time arrives. For this purpose SMIL provides the begin attribute, which expresses a delay, on the audio, video, img, and ref elements. Using the begin attribute, the playback start time can be made to differ from clip to clip.

図１２は、クリップ毎に再生開始時間を異ならせたSMIL文書の例を示した図である。図１２に示す文書では、begin属性を用いることで、videoであるhttp://mserv.com/MPEG/movie0v.mpvの時刻00:00:00から00:01:00の情報の再生時間に対して、audioであるhttp://mserv.com/MPEG/movie0a.mp2の時刻00:00:10から00:000:40の情報を10秒送らせて再生している。また、videoであるhttp://mserv.com/MPEG/movie0v.mpvの時刻00:04:00から00:05:00の情報に対して、audioであるhttp://mserv.com/MPEG/movie0a.mp2の時刻00:04:15から00:05:00の情報を15秒送らせて再生している。 FIG. 12 is a diagram illustrating an example of a SMIL document in which the reproduction start time is different for each clip. In the document shown in FIG. 12, by using the begin attribute, the reproduction time of information from time 00:00:00 to 00:01:00 of http://mserv.com/MPEG/movie0v.mpv which is video is used. Then, the information from time 00:00:10 to 00: 000: 40 of audio http://mserv.com/MPEG/movie0a.mp2 is sent for 10 seconds and reproduced. Also, for the information from time 00:04:00 to 00:05:00 of http://mserv.com/MPEG/movie0v.mpv that is video, http://mserv.com/MPEG/ that is audio Information from movie0a.mp2 from 00:04:15 to 00:05:00 is played for 15 seconds.

このように、begin属性を用いることで、構造記述データに含まれる複数のメディア間の再生時間をずらすことで、複数のメディアの同期をとることができる。 As described above, by using the begin attribute, it is possible to synchronize a plurality of media by shifting reproduction times between the plurality of media included in the structure description data.
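The delays in the FIG. 12 description (10 seconds and 15 seconds) match the gap between the start of the video clip and the start of the audio clip. The following Python sketch shows one way such a begin value could be computed. The function names are hypothetical, and the rule "delay = audio clip start minus video clip start" is an assumption inferred from the numbers above, not a procedure stated in the text.

```python
def to_seconds(hms: str) -> int:
    """Convert an hh:mm:ss string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def begin_delay(video_clip_start: str, audio_clip_start: str) -> str:
    """Return a SMIL begin value such as "10s" for the audio clip.

    Assumption: both clips are cut from media sharing one timeline,
    so the delay is the gap between the two clip start times.
    """
    delay = to_seconds(audio_clip_start) - to_seconds(video_clip_start)
    if delay < 0:
        raise ValueError("audio clip starts before the video clip")
    return f"{delay}s"


# The two clip pairs described for FIG. 12.
print(begin_delay("00:00:00", "00:00:10"))
print(begin_delay("00:04:00", "00:04:15"))
```

The returned string would be placed in the begin attribute of the audio element inside the par element.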

以上のように、実施の形態１によれば、メディアコンテンツの構成を表現する構造記述データから、そのメディアコンテンツの再生形態を表現する表現記述データへの変換が行える。これにより、メディアコンテンツの配信において構造記述データを適当に処理、あるいは選択することにより、ユーザや端末に合わせた配信データを作成することができる。 As described above, according to the first embodiment, the structure description data expressing the configuration of the media content can be converted into the expression description data expressing the reproduction form of the media content. Accordingly, distribution data suitable for the user or terminal can be created by appropriately processing or selecting the structure description data in the distribution of the media content.

また、実施の形態１によれば、構造記述データが複数のメディアから構成されていても、メディア間で同期をとることができる。また、複数のメディア間で再生タイミングをずらすことでも、メディア間の同期をとることができる。 Further, according to the first embodiment, even if the structure description data is composed of a plurality of media, synchronization can be established between the media. Also, synchronization between media can be achieved by shifting the reproduction timing among a plurality of media.

また、実施の形態１では、メディアコンテンツの構成を表現する構造記述データから、そのメディアコンテンツの再生形態を表現する表現記述データへの変換を、記述コンバータ１００３が行う形態で説明したが、記述コンバータ１００３が行う処理をプログラムにし、コンピュータに読み取らせて実行させる形態にしてもよい。 In the first embodiment, the conversion from the structure description data representing the configuration of the media content into the expression description data representing the reproduction form of that media content has been described as being performed by the description converter 1003. However, the processing performed by the description converter 1003 may instead be implemented as a program that a computer reads and executes.

また、記述コンバータ１００３が行う処理をコンピュータに実行させるプログラムを記憶媒体に格納する形態であってもよい。 Alternatively, a program for causing a computer to execute the processing performed by the description converter 1003 may be stored in a storage medium.

（実施の形態２） (Embodiment 2)
実施の形態２は、端末に合わせたメディアコンテンツの再生、配信を行うために、構造記述データにメディアセグメントとその代替データを記述し、構造記述データをメディアセグメントまたは代替データの再生形態を表現する表現記述データへの変換を行うものである。これにより、動画であるメディアセグメントの代表画像といったような代替データの集合によって記述される構造記述データから、代替データの表現記述データへの変換が行える。以下、実施の形態２について説明する。 In the second embodiment, in order to reproduce and distribute media content tailored to the terminal, media segments and their alternative data are described in the structure description data, and the structure description data is converted into expression description data that expresses the reproduction form of the media segments or the alternative data. As a result, structure description data described as a set of alternative data, such as a representative image of a media segment that is a moving image, can be converted into expression description data for the alternative data. The second embodiment is described below.

図１３、図１４、図１５に本実施の形態における構造記述データの例を示す図である。実施の形態２では、構造記述データをコンピュータ上で表現する一例として、Extensible Markup Language(XML)を用いている。図１３は、構造記述データをXMLで記述するためのDTDである。また、図１４は、MPEG1を例に動画と音声が同期したメディアコンテンツに対する構造記述データの例である。また、図１５は、動画と音声がそれぞれ別のメディアとなっているメディアコンテンツの構造記述データの例である。 FIG. 13, FIG. 14 and FIG. 15 are diagrams showing examples of structure description data in the present embodiment. In the second embodiment, Extensible Markup Language (XML) is used as an example of expressing structure description data on a computer. FIG. 13 is a DTD for describing structure description data in XML. FIG. 14 is an example of structure description data for media content in which video and audio are synchronized, taking MPEG1 as an example. FIG. 15 is an example of structure description data of media content in which moving images and audio are different media.

まず、図１３を用いて、構造記述データをXMLで記述するための定義であるDocument Type Definition(DTD)の説明をする。 First, the Document Type Definition (DTD), which is the definition for describing structure description data in XML, will be described with reference to FIG. 13.

図中１３０１に示すように、contents要素は、par要素とmediaObject要素で構成される。また、図中１３０２で示すように、contents要素は、キャラクタデータで示されるtitle属性を持つようになっている。また、図中１３０３に示すように、par要素は、複数の子要素であるmediaObject要素から構成される。 As indicated by reference numeral 1301 in the figure, the contents element is composed of par elements and mediaObject elements. As indicated by reference numeral 1302, the contents element has a title attribute given as character data. As indicated by reference numeral 1303, the par element is composed of a plurality of mediaObject child elements.

また、図中１３０４に示すように、mediaObject要素は、segment要素から構成される。また、図中１３０５に示すように、mediaObject要素は、そのtype属性によってメディアのタイプを指定される。この例では、メディアのタイプとして音声情報であるaudio、動画情報であるvideo、静止画情報であるimage、音声および動画が同期した情報であるaudiovideo、音声および静止画の情報であるaudioimageといったメディアのタイプが指定される。また、type属性の指定がない場合は、type属性は、デフォルトでaudiovideoに設定される。 Further, as indicated by reference numeral 1304 in the figure, the mediaObject element is composed of segment elements. As indicated by reference numeral 1305, the media type of the mediaObject element is specified by its type attribute. In this example, the media type can be audio (audio information), video (moving image information), image (still image information), audiovideo (audio and moving images synchronized with each other), or audioimage (audio and still image information). If the type attribute is not specified, it defaults to audiovideo.

また、図中１３０６に示すように、mediaObject要素は、format要素で、動画に対してMPEG1やMPEG2といったメディアのフォーマットが、静止画に対してはgifやjpegといったフォーマットが指定される。また、図中１３０７に示すように、mediaObject要素は、src属性によりデータが保存されている場所が指定される。src属性でUniform Resource Locator (URL)を指定することによりデータが保存されている場所を指定できる。 Also, as indicated by reference numeral 1306 in the figure, the format attribute of the mediaObject element specifies the media format, such as MPEG1 or MPEG2 for moving images and gif or jpeg for still images. As indicated by reference numeral 1307, the src attribute of the mediaObject element specifies the location where the data is stored. The location is given by specifying a Uniform Resource Locator (URL) in the src attribute.

また、図中１３０８に示すように、start属性により、segment要素の開始時間に対応する、mediaObject要素で指定されたメディア内部における時間が指定される。また、end属性により、segment要素の終了時間に対応する、mediaObject要素で指定されたメディア内部における時間が指定される。 Further, as indicated by reference numeral 1308 in the figure, the start attribute specifies the time inside the medium specified by the mediaObject element corresponding to the start time of the segment element. The end attribute specifies the time inside the media specified by the mediaObject element corresponding to the end time of the segment element.

また、図中１３０９に示すように、segment要素は、alt要素を有する。alt要素は、該当メディアセグメントの代替データを表すものである。そして、図中１３１０に示すように、alt要素は、type属性によってimageやaudioといったメディアのタイプが指定される。また、alt要素は、format属性によって、静止画像であればgifやjpegといったメディアのフォーマットが指定される。また、alt要素は、src属性により、保存されている場所が指定される。 Further, as indicated by reference numeral 1309 in the drawing, the segment element has an alt element. The alt element represents alternative data of the corresponding media segment. Then, as indicated by reference numeral 1310 in the figure, the alt element designates a media type such as image or audio by the type attribute. In the alt element, a media format such as gif or jpeg is specified for a still image by the format attribute. In the alt element, the saved location is specified by the src attribute.

また、alt要素は各メディアセグメントに複数指定可能とし、同じメディアの場合は、登場順に再生することとする。 Also, a plurality of alt elements can be specified for each media segment, and in the case of the same media, playback is performed in the order of appearance.

また、alt要素は子要素posを有する。そして、alt要素は、子要素posによって、src属性で指定されたデータの中でどの区間であるかを指定できる。pos要素のstart属性およびend属性は、それぞれsrc属性で示されたメディア内部での開始時間、終了時間を表す。 The alt element has a child element pos. Through the pos child element, the alt element can specify which interval of the data specified by the src attribute is to be used. The start attribute and end attribute of the pos element represent, respectively, the start time and end time within the media indicated by the src attribute.

なお、本実施の形態においては、時間情報を開始時間と終了時間との組によって指定しているが、開始時間と継続時間との組によって表してもよい。 In the present embodiment, the time information is specified by a set of start time and end time, but may be expressed by a set of start time and duration.

次に、MPEG1を例に動画と音声が同期したメディアコンテンツに対する構造記述データの例について、図１４を用いて説明する。 Next, an example of structure description data for media content in which moving images and audio are synchronized will be described with reference to FIG. 14, taking MPEG1 as an example.

図１４に示す構造記述データには、contents要素にMovie etcというタイトルが指定されている。そして、mediaObject要素のタイプにはaudiovideoが、フォーマットとしてMPEG1が、保存場所としてhttp://mserv.com/MPEG/movie0.mpgが指定されている。また、mediaObject要素は、時刻00:00:00から00:01:00の時間情報を持つsegment要素と、時刻00:01:00から00:02:00の時間情報を持つsegment要素と、時刻00:03:00から00:04:00の時間情報を持つsegment要素と、時刻00:04:00から00:05:00の時間情報を持つsegment要素と、を有する。つまり、mediaObject要素は、時刻00:02:00から00:03:00を除いた記述になっている。 In the structure description data shown in FIG. 14, the title “Movie etc” is specified in the contents element. For the mediaObject element, audiovideo is specified as the type, MPEG1 as the format, and http://mserv.com/MPEG/movie0.mpg as the storage location. The mediaObject element has four segment elements, with time information from 00:00:00 to 00:01:00, from 00:01:00 to 00:02:00, from 00:03:00 to 00:04:00, and from 00:04:00 to 00:05:00. In other words, the mediaObject element describes the content with the interval from 00:02:00 to 00:03:00 excluded.

また、時刻00:00:00から00:01:00の時間情報を持つsegment要素は、audiovideoの代替データであるalt要素により指示されている。時刻00:00:00から00:01:00の時間情報を持つsegment要素は、タイプがimage、フォーマットがjpeg、保存場所がhttp://mserv.com/IMAGE/s0.jpgであるalt要素と、タイプがaudio、フォーマットがmpeg1、保存場所がhttp://mserv.com/MPEG/movie0.mp2、時刻00:00:00から00:01:00の時間情報を持つalt要素と、から構成されている。 Further, for the segment element having time information from 00:00:00 to 00:01:00, alt elements are specified as alternative data for the audiovideo. This segment element consists of an alt element whose type is image, format is jpeg, and storage location is http://mserv.com/IMAGE/s0.jpg, and an alt element whose type is audio, format is mpeg1, and storage location is http://mserv.com/MPEG/movie0.mp2, with time information from 00:00:00 to 00:01:00.

また、時刻00:01:00から00:02:00の時間情報を持つsegment要素は、タイプがimage、フォーマットがjpeg、保存場所がhttp://mserv.com/IMAGE/s1.jpgであるalt要素と、タイプがaudio、フォーマットがmpeg1、保存場所がhttp://mserv.com/MPEG/movie0.mp2、時刻00:01:00から00:01:30の時間情報を持つalt要素と、から構成されている。 The segment element having time information from 00:01:00 to 00:02:00 consists of an alt element whose type is image, format is jpeg, and storage location is http://mserv.com/IMAGE/s1.jpg, and an alt element whose type is audio, format is mpeg1, and storage location is http://mserv.com/MPEG/movie0.mp2, with time information from 00:01:00 to 00:01:30.

また、時刻00:03:00から00:04:00の時間情報を持つsegment要素は、タイプがimage、フォーマットがjpeg、保存場所がhttp://mserv.com/IMAGE/s3.jpgであるalt要素と、タイプがaudio、フォーマットがmpeg1、保存場所がhttp://mserv.com/MPEG/movie0.mp2、時刻00:03:00から00:03:30の時間情報を持つalt要素と、から構成されている。 The segment element having time information from 00:03:00 to 00:04:00 consists of an alt element whose type is image, format is jpeg, and storage location is http://mserv.com/IMAGE/s3.jpg, and an alt element whose type is audio, format is mpeg1, and storage location is http://mserv.com/MPEG/movie0.mp2, with time information from 00:03:00 to 00:03:30.

また、時刻00:04:00から00:05:00の時間情報を持つsegment要素は、タイプがimage、フォーマットがjpeg、保存場所がhttp://mserv.com/IMAGE/s４.jpgであるalt要素と、タイプがaudio、フォーマットがmpeg1、保存場所がhttp://mserv.com/MPEG/movie0.mp2、時刻00:04:00から00:05:00の時間情報を持つalt要素と、から構成されている。 The segment element having time information from 00:04:00 to 00:05:00 consists of an alt element whose type is image, format is jpeg, and storage location is http://mserv.com/IMAGE/s4.jpg, and an alt element whose type is audio, format is mpeg1, and storage location is http://mserv.com/MPEG/movie0.mp2, with time information from 00:04:00 to 00:05:00.
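To make the FIG. 14 description concrete, the following Python sketch writes the first segment as XML and reads it back with the standard library. The element and attribute names (contents, mediaObject, segment, alt, pos, type, format, src, start, end) come from the text; the exact XML layout, the function name read_segments, and the dictionary shape are assumptions, since the figure itself is not reproduced here.

```python
import xml.etree.ElementTree as ET

# Assumed XML rendering of the first segment described for FIG. 14.
STRUCTURE = """\
<contents title="Movie etc">
  <mediaObject type="audiovideo" format="MPEG1"
               src="http://mserv.com/MPEG/movie0.mpg">
    <segment start="00:00:00" end="00:01:00">
      <alt type="image" format="jpeg" src="http://mserv.com/IMAGE/s0.jpg"/>
      <alt type="audio" format="mpeg1" src="http://mserv.com/MPEG/movie0.mp2">
        <pos start="00:00:00" end="00:01:00"/>
      </alt>
    </segment>
  </mediaObject>
</contents>
"""


def read_segments(xml_text):
    """Return one dict per segment, including its alternative data."""
    root = ET.fromstring(xml_text)
    segments = []
    for media in root.iter("mediaObject"):
        # The text says type defaults to audiovideo when it is omitted.
        media_type = media.get("type", "audiovideo")
        for seg in media.findall("segment"):
            alts = []
            for alt in seg.findall("alt"):
                pos = alt.find("pos")  # optional child giving the interval
                alts.append({
                    "type": alt.get("type"),
                    "src": alt.get("src"),
                    "start": pos.get("start") if pos is not None else None,
                    "end": pos.get("end") if pos is not None else None,
                })
            segments.append({
                "media_type": media_type,
                "src": media.get("src"),
                "start": seg.get("start"),
                "end": seg.get("end"),
                "alts": alts,
            })
    return segments


for segment in read_segments(STRUCTURE):
    print(segment["start"], segment["end"], len(segment["alts"]))
```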

次に、動画と音声がそれぞれ別のメディアとなっているメディアコンテンツの構造記述データの例について図１５を用いて説明する。 Next, an example of structure description data of media content in which moving images and audio are separate media will be described with reference to FIG. 15.

図１５に示す構造記述データには、contents要素にMovie etcというタイトルが指定されている。そして、図１５の例では、contents要素が、videoのタイプのmediaObject要素と、audioのタイプのmediaObject要素から構成されている。よって、par要素により、タイプがaudioのmediaObject要素と、タイプがvideoのmediaObject要素との、同期がとられている。 In the structure description data shown in FIG. 15, the title “Movie etc” is specified in the contents element. In the example of FIG. 15, the contents element includes a mediaObject element of the video type and a mediaObject element of the audio type. Therefore, the par element synchronizes the mediaObject element whose type is audio and the mediaObject element whose type is video.

タイプがvideoのmediaObject要素は、フォーマットとしてMPEG1が、保存場所としてhttp://mserv.com/MPEG/movie0v.mpvが指定されている。また、タイプがvideoのmediaObject要素は、時刻00:00:00から00:01:00の時間情報を持つsegment要素と、時刻00:01:0から00:02:00の時間情報を持つsegment要素と、時刻00:03:00から00:04:00の時間情報を持つsegment要素と、時刻00:04:00から00:05:00の時間情報を持つsegment要素と、を有する。つまり、タイプがvideoのmediaObject要素は、時刻00:02:00から00:03:00を除いた記述になっている。 The mediaObject element of type video specifies MPEG1 as the format and http://mserv.com/MPEG/movie0v.mpv as the storage location. It has four segment elements, with time information from 00:00:00 to 00:01:00, from 00:01:00 to 00:02:00, from 00:03:00 to 00:04:00, and from 00:04:00 to 00:05:00. In other words, the mediaObject element of type video describes the content with the interval from 00:02:00 to 00:03:00 excluded.

また、時刻00:00:00から00:01:00の時間情報を持つsegment要素は、videoの代替データであるalt要素により指示されている。時刻00:00:00から00:01:00の時間情報を持つsegment要素は、タイプがimage、フォーマットがjpeg、保存場所がhttp://mserv.com/IMAGE/s0.jpgであるalt要素が指示されている。また、時刻00:01:00から00:02:00の時間情報を持つsegment要素は、タイプがimage、フォーマットがjpeg、保存場所がhttp://mserv.com/IMAGE/s1.jpgであるalt要素が指示されている。また、時刻00:03:00から00:04:00の時間情報を持つsegment要素は、タイプがimage、フォーマットがjpeg、保存場所がhttp://mserv.com/IMAGE/s3.jpgであるalt要素が指示されている。また、時刻00:04:00から00:05:00の時間情報を持つsegment要素は、タイプがimage、フォーマットがjpeg、保存場所がhttp://mserv.com/IMAGE/s４.jpgであるalt要素が指示されている。 Further, for the segment element having time information from 00:00:00 to 00:01:00, an alt element is specified as alternative data for the video. For the segment element from 00:00:00 to 00:01:00, an alt element whose type is image, format is jpeg, and storage location is http://mserv.com/IMAGE/s0.jpg is specified. For the segment element from 00:01:00 to 00:02:00, an alt element whose type is image, format is jpeg, and storage location is http://mserv.com/IMAGE/s1.jpg is specified. For the segment element from 00:03:00 to 00:04:00, an alt element whose type is image, format is jpeg, and storage location is http://mserv.com/IMAGE/s3.jpg is specified. For the segment element from 00:04:00 to 00:05:00, an alt element whose type is image, format is jpeg, and storage location is http://mserv.com/IMAGE/s4.jpg is specified.

また、時刻00:00:00から00:01:00の時間情報を持つsegment要素は、audioの代替データであるalt要素により指示されている。時刻00:00:00から00:01:00の時間情報を持つsegment要素は、タイプがaudio、フォーマットがmpeg1、保存場所がhttp://mserv.com/MPEG/movie0.mp2、時刻00:00:00から00:01:00の時間情報を持つalt要素が指示されている。また、時刻00:01:00から00:02:00の時間情報を持つsegment要素は、タイプがaudio、フォーマットがmpeg1、保存場所がhttp://mserv.com/MPEG/movie0.mp2、時刻00:01:00から00:01:30の時間情報を持つalt要素が指示されている。また、時刻00:03:00から00:04:00の時間情報を持つsegment要素は、タイプがaudio、フォーマットがmpeg1、保存場所がhttp://mserv.com/MPEG/movie0.mp2、時刻00:03:00から00:03:30の時間情報を持つalt要素が指示されている。また、時刻00:04:00から00:05:00の時間情報を持つsegment要素は、タイプがaudio、フォーマットがmpeg1、保存場所がhttp://mserv.com/MPEG/movie0.mp2、時刻00:04:00から00:05:00の時間情報を持つalt要素が指示されている。 Further, for the segment element having time information from 00:00:00 to 00:01:00, an alt element is specified as alternative data for the audio. For the segment element from 00:00:00 to 00:01:00, an alt element whose type is audio, format is mpeg1, and storage location is http://mserv.com/MPEG/movie0.mp2, with time information from 00:00:00 to 00:01:00, is specified. For the segment element from 00:01:00 to 00:02:00, an alt element whose type is audio, format is mpeg1, and storage location is http://mserv.com/MPEG/movie0.mp2, with time information from 00:01:00 to 00:01:30, is specified. For the segment element from 00:03:00 to 00:04:00, an alt element whose type is audio, format is mpeg1, and storage location is http://mserv.com/MPEG/movie0.mp2, with time information from 00:03:00 to 00:03:30, is specified. For the segment element from 00:04:00 to 00:05:00, an alt element whose type is audio, format is mpeg1, and storage location is http://mserv.com/MPEG/movie0.mp2, with time information from 00:04:00 to 00:05:00, is specified.

本実施の形態においても、実施の形態１と同様に、表現記述データとしてSMILを用いる。各メディアセグメント自身を再生するSMIL文書の出力は実施の形態１と同様である。 Also in the present embodiment, SMIL is used as the expression description data as in the first embodiment. The output of the SMIL document that reproduces each media segment itself is the same as in the first embodiment.

以下に、記述コンバータ１００３が、代替データを再生するSMIL文書を出力する処理について述べる。これは、実施の形態１における図４のフローチャートの中で、SMIL文書を出力するステップＳ４０５とステップＳ４１２の処理が異なるだけである。よって、実施の形態１と異なる処理について説明する。まず、ステップＳ４０５の処理について、図１６を用いて説明する。 The following describes the process in which description converter 1003 outputs a SMIL document for reproducing alternative data. It differs from the flowchart of FIG. 4 in the first embodiment only in the processing of steps S405 and S412, which output the SMIL document. Therefore, only the processing that differs from the first embodiment is described. First, the process of step S405 will be described with reference to FIG. 16.

まず、記述コンバータ１００３は、SMILのヘッダを出力する（ステップ１６０１）。次に、記述コンバータ１００３は、<seq>と</seq>でメディアセグメント全体を囲う（ステップＳ１６０２）。そして、記述コンバータ１００３は、囲ったメディアセグメント毎に、メディアタイプの異なる代替データがあるか判断する（ステップＳ１６０３）。 First, the description converter 1003 outputs a SMIL header (step S1601). Next, description converter 1003 encloses the whole set of media segments in <seq> and </seq> (step S1602). Then, for each enclosed media segment, the description converter 1003 determines whether there is alternative data of differing media types (step S1603).

そして、記述コンバータ１００３は、Ｓ１６０３において、メディアタイプの異なる代替データがないという判断をすると、代替データが複数あるか判断する（ステップＳ１６０４）。そして、記述コンバータ１００３は、代替データが複数ある場合は、複数の代替データを<seq>と</seq>で囲む（ステップＳ１６０５）。一方、記述コンバータ１００３は、代替データがひとつの場合は<seq>と</seq>で囲まず、代替データ毎に以下の処理をする。 If the description converter 1003 determines in S1603 that there is no alternative data of differing media types, it determines whether there are a plurality of alternative data (step S1604). When there are a plurality of alternative data, description converter 1003 encloses them in <seq> and </seq> (step S1605). When there is only one alternative data, the description converter 1003 does not enclose it in <seq> and </seq>. In either case, it then performs the following processing for each alternative data.

次に、記述コンバータ１００３は、代替データのタイプに合わせて、SMILのaudio要素、video要素、img要素などから対応する要素を選択する（ステップＳ１６０６）。次に、記述コンバータ１００３は、alt要素の子要素posの、start属性およびend属性が指定されている場合は、start属性の値をSMILのclip-beginに、end属性の値をclip-end属性に設定する（ステップＳ１６０７）。そして、記述コンバータ１００３は、代替データ毎に、保存場所を示すsrc属性を設定する（ステップＳ１６０８）。 Next, description converter 1003 selects a corresponding element from the SMIL audio element, video element, img element, and the like in accordance with the type of alternative data (step S1606). Next, when the start attribute and the end attribute of the child element pos of the alt element are specified, the description converter 1003 sets the start attribute value to the SMIL clip-begin and the end attribute value to the clip-end attribute. (Step S1607). Then, the description converter 1003 sets a src attribute indicating a storage location for each alternative data (step S1608).
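Steps S1604 to S1608 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the function names and the dictionary keys are hypothetical, the fallback to the ref element for unlisted types is an assumption, and the time values are copied verbatim even though a real SMIL player may require a prefixed form for clip-begin and clip-end.

```python
import xml.etree.ElementTree as ET

# Alternative-data type -> SMIL element name (step S1606).
SMIL_ELEMENT = {"audio": "audio", "video": "video", "image": "img"}


def alt_to_smil(alt):
    """Build one SMIL media element from one alternative-data entry."""
    # Assumption: unknown types fall back to the generic ref element.
    elem = ET.Element(SMIL_ELEMENT.get(alt["type"], "ref"))
    # Step S1607: copy the pos interval only when both ends are given.
    if alt.get("start") is not None and alt.get("end") is not None:
        elem.set("clip-begin", alt["start"])
        elem.set("clip-end", alt["end"])
    # Step S1608: storage location.
    elem.set("src", alt["src"])
    return elem


def alts_to_smil(alts):
    """Steps S1604/S1605: wrap in <seq> only when there are several."""
    clips = [alt_to_smil(alt) for alt in alts]
    if len(clips) == 1:
        return clips[0]
    seq = ET.Element("seq")
    seq.extend(clips)
    return seq


images = [
    {"type": "image", "src": "http://mserv.com/IMAGE/s0.jpg"},
    {"type": "image", "src": "http://mserv.com/IMAGE/s1.jpg"},
]
print(ET.tostring(alts_to_smil(images), encoding="unicode"))
```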

一方、記述コンバータ１００３は、Ｓ１６０３において、メディアタイプの異なる代替データがあると判断した場合は、同じ種類のメディアタイプで代替データをグループ化する（ステップＳ１６０９）。 On the other hand, if the description converter 1003 determines in S1603 that there is alternative data with a different media type, the description converter 1003 groups the alternative data with the same type of media type (step S1609).

次に、記述コンバータ１００３は、グループの再生終了の同期をとるために、継続時間の最も長い代替データを調べる必要がある。よって、記述コンバータ１００３は、各グループ毎に、代替データのstart属性,end属性の値から継続時間を計算する（ステップＳ１６１０）。ただし、メディアタイプが静止画像（ｉｍａｇｅ）の場合か、start属性、end属性が指定されていない場合は、その代替データの継続時間は０とする。 Next, description converter 1003 needs to examine alternative data having the longest duration in order to synchronize the end of reproduction of the group. Therefore, the description converter 1003 calculates the duration from the values of the start attribute and end attribute of the alternative data for each group (step S1610). However, if the media type is a still image (image) or the start attribute and end attribute are not specified, the duration of the alternative data is set to 0.
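The grouping and duration rules of steps S1609 and S1610 can be sketched as below. The names are hypothetical, and summing the durations within a group (on the basis that alternative data of the same media type is played in order of appearance) is an assumption; the text only says that a duration is calculated for each group.

```python
from collections import defaultdict


def to_seconds(hms: str) -> int:
    """Convert an hh:mm:ss string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def alt_duration(alt) -> int:
    """Duration of one alternative data, following the rule in the text:
    still images and entries without start/end count as 0."""
    if alt["type"] == "image":
        return 0
    if alt.get("start") is None or alt.get("end") is None:
        return 0
    return to_seconds(alt["end"]) - to_seconds(alt["start"])


def longest_group(alts):
    """Group by media type (S1609), total each group (S1610), and
    return the type of the longest group together with all totals."""
    totals = defaultdict(int)
    for alt in alts:
        totals[alt["type"]] += alt_duration(alt)
    return max(totals, key=totals.get), dict(totals)


# Alternative data of the first segment described for FIG. 14.
segment_alts = [
    {"type": "image", "src": "http://mserv.com/IMAGE/s0.jpg"},
    {"type": "audio", "src": "http://mserv.com/MPEG/movie0.mp2",
     "start": "00:00:00", "end": "00:01:00"},
]
print(longest_group(segment_alts))
```

The media type returned first is the one whose element would receive the id referenced by endsync.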

次に、記述コンバータ１００３は、最も継続時間の長いグループに、再生終了の同期を合わせるようにSMILのpar要素のendsync属性を設定し（ステップＳ１６１１）、グループ全体を<par>と</par>で囲って（ステップＳ１６１２）、各メディアタイプのグループ毎にＳ１６０４の処理を行う。 Next, description converter 1003 sets the endsync attribute of the SMIL par element so that the end of playback is synchronized to the group with the longest duration (step S1611), encloses all of the groups in <par> and </par> (step S1612), and then performs the processing from S1604 for each media-type group.

endsync属性とは、<par>と</par>で囲んだ複数メディアを並列(parallel)に再生・表示させる際に、メディアによって継続時間が異なる場合に使用するものである。つまり、endsync属性とは、このような場合にどのメディアに、全てのメディアの再生・表示の終了を合わせるかを指定するものである。endsync属性におけるメディアの指定方法は幾つかあるが、本実施例ではメディアのidによって指定する方法を用いている。具体的には、あるタイプのメディアの属性に識別子であるidを付与する。そして、endsync属性＝idとすることで、idが付与されたメディアの終了時間に合わせて、このidが付与されたメディアと同一のグループ内のメディアが同期して終了するようになる。 The endsync attribute is used when a plurality of media enclosed in <par> and </par> are played back and displayed in parallel and their durations differ. In other words, the endsync attribute specifies which media the end of playback and display of all the media should be aligned to. There are several ways to designate the media in the endsync attribute; this embodiment designates it by the media id. Specifically, an id, which is an identifier, is assigned as an attribute of the media of a certain type. Then, by setting the endsync attribute to that id, the other media in the same group end in synchronization with the end time of the media to which the id is assigned.

これにより、静止画のように継続時間を持たないメディアであって、durなどの属性によって表示時間が指定されていないものに対しても、このメディアの再生終了時間をidが付与されたメディアの再生時間と同じにすることができる。例えば、音声のメディアが再生されている間はずっと静止画を表示しつづけるようなことができる。 As a result, even for media that has no duration of its own, such as a still image, and whose display time is not specified by an attribute such as dur, the playback end time can be made the same as that of the media to which the id is assigned. For example, a still image can be kept on display for as long as the audio media is playing.

図１７に、図１４に示す構造記述データから、上記の処理により出力されるSMIL文書を示す。 FIG. 17 shows the SMIL document output by the above processing from the structure description data shown in FIG. 14.

図１７のSMIL文書には、複数のグループ１７０１〜１７０４が記述されている。１７０１で記されるグループには、タイプがimage、フォーマットがjpeg、保存場所がhttp://mserv.com/IMAGE/s0.jpgである代替データと、タイプがaudio、フォーマットがmpeg1、保存場所がhttp://mserv.com/MPEG/movie0.mp2、時刻00:00:00から00:01:00の時間情報を持つ代替データと、から構成されている。また、タイプがaudioの代替データには、id属性としてa0が付与されている。そして、グループ１７０１には、endsync属性にid(a0)が設定されている。これにより、グループ１７０１に含まれる代替データの再生終了時間は、タイプがaudioの代替データに合わせられる。つまり、タイプがimageの代替データは、タイプがaudioの代替データの再生時間中、再生され続けることになる。 A plurality of groups 1701 to 1704 are described in the SMIL document of FIG. The group indicated by 1701 includes alternative data whose type is image, format is jpeg and storage location is http://mserv.com/IMAGE/s0.jpg, type is audio, format is mpeg1 and storage location is http://mserv.com/MPEG/movie0.mp2 and alternative data having time information from 00:00:00 to 00:01:00. Further, a0 is assigned as the id attribute to the alternative data of the type audio. In the group 1701, id (a0) is set in the endsync attribute. Thereby, the reproduction end time of the alternative data included in the group 1701 is matched with the alternative data whose type is audio. That is, the alternative data of the type image is continuously reproduced during the reproduction time of the alternative data of the type audio.
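As an illustration of group 1701, the following Python sketch builds a par element whose endsync attribute points at the audio clip. The URLs, the id a0, and the id(a0) form come from the description above; the function name, the attribute order, and the verbatim time values are assumptions, since FIG. 17 itself is not reproduced here.

```python
import xml.etree.ElementTree as ET


def build_group(image_src, audio_src, clip_begin, clip_end, audio_id):
    """Build <par endsync="id(...)"> holding a still image and an audio clip."""
    # The par ends when the element carrying audio_id ends.
    par = ET.Element("par", {"endsync": f"id({audio_id})"})
    # The still image has no duration of its own; it is shown until the par ends.
    ET.SubElement(par, "img", {"src": image_src})
    ET.SubElement(par, "audio", {
        "id": audio_id,
        "src": audio_src,
        "clip-begin": clip_begin,  # assumed to be copied verbatim from pos
        "clip-end": clip_end,
    })
    return par


group_1701 = build_group(
    "http://mserv.com/IMAGE/s0.jpg",
    "http://mserv.com/MPEG/movie0.mp2",
    "00:00:00",
    "00:01:00",
    "a0",
)
print(ET.tostring(group_1701, encoding="unicode"))
```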

なお、グループ１７０２〜１７０４の説明は省略する。 Note that description of the groups 1702 to 1704 is omitted.

次に、ステップＳ４１２の処理について、図１８を用いて説明する。まず、記述コンバータ１００３は、SMILのヘッダを出力する（ステップＳ１８０１）。次に、記述コンバータ１００３は、<seq>と</seq>でメディアセグメント全体を囲う（ステップＳ１８０２）。 Next, the process of step S412 will be described with reference to FIG. 18. First, the description converter 1003 outputs a SMIL header (step S1801). Next, description converter 1003 encloses the whole set of media segments in <seq> and </seq> (step S1802).

そして、記述コンバータ１００３は、メディアセグメントのグループの、時間の早い順に、同じmediaObject要素に属する代替データをグループ化し（ステップＳ１８０３）、グループ毎にstart属性とend属性の値から継続時間を計算する（ステップ１８０４）。ただし、メディアタイプが静止画像（image）の場合か、start属性、end属性が指定されていない場合は、その代替データの継続時間は０とする。 Then, the description converter 1003 groups the alternative data belonging to the same mediaObject element, taking the groups of media segments in chronological order (step S1803), and calculates the duration of each group from the values of the start and end attributes (step S1804). However, if the media type is a still image (image), or if the start and end attributes are not specified, the duration of that alternative data is set to 0.

次に、記述コンバータ１００３は、最も継続時間の長いグループに、再生終了の同期を合わせるように、SMILのpar要素のendsync属性を設定し、全体をその<par>と</par>で囲う（ステップＳ１８０５）。 Next, the description converter 1003 sets the endsync attribute of the SMIL par element so that the end of playback is synchronized to the group with the longest duration, and encloses the whole in that <par> and </par> (step S1805).

次に、記述コンバータ１００３は、代替データが複数あるか判断する（ステップＳ１８０６）。そして、記述コンバータ１００３は、代替データが複数ある場合は、複数の代替データを<seq>と</seq>で囲む（ステップＳ１８０７）。一方、記述コンバータ１００３は、代替データがひとつの場合は<seq>と</seq>で囲まず、代替データ毎に以下の処理をする。 Next, description converter 1003 determines whether there are a plurality of alternative data (step S1806). If there are a plurality of alternative data, description converter 1003 encloses them in <seq> and </seq> (step S1807). When there is only one alternative data, the description converter 1003 does not enclose it in <seq> and </seq>. In either case, it then performs the following processing for each alternative data.

次に、記述コンバータ１００３は、代替データのタイプに合わせて、SMILのaudio要素、video要素、img要素などから対応する要素を選択する（ステップＳ１８０８）。次に、記述コンバータ１００３は、alt要素の子要素posの、start属性およびend属性が指定されている場合は、start属性の値をSMILのclip-begin属性に、end属性の値をclip-end属性に設定する（ステップＳ１８０９）。そして、記述コンバータ１００３は、代替データ毎に、保存場所を示すsrc属性を設定する（ステップＳ１８１０）。 Next, description converter 1003 selects a corresponding element from the SMIL audio element, video element, img element, and the like in accordance with the type of alternative data (step S1808). Next, when the start attribute and the end attribute of the child element pos of the alt element are specified, the description converter 1003 sets the start attribute value to the SMIL clip-begin attribute and the end attribute value to the clip-end. The attribute is set (step S1809). Then, the description converter 1003 sets the src attribute indicating the storage location for each alternative data (step S1810).

なお、図１４に示す構造記述データから、図１８に示す処理により出力されるSMIL文書は図１７と同じものとなる。 Note that the SMIL document output by the process shown in FIG. 18 from the structure description data shown in FIG. 14 is the same as that shown in FIG. 17.

また、SMIL文書においてpar要素の中の各クリップを同期させるために、再生開始時間を異なるものとする必要が出てくる場合がある。この場合は、各クリップの再生開始時間を計算し、その時間がくれば再生を始めるようにする必要がある。 In addition, in order to synchronize each clip in the par element in the SMIL document, it may be necessary to make the playback start times different. In this case, it is necessary to calculate the playback start time of each clip and start playback when that time arrives.

SMILには、このような目的のために、audio要素,video要素,img要素,ref要素にbeginという属性が用意されており、これらを用いることで実現できる。 For such purposes, SMIL provides an attribute called “begin” in the audio element, video element, img element, and ref element, which can be realized by using these attributes.

以上のように、実施の形態２によれば、メディアコンテンツの全体あるいは部分の構成を、メディアセグメントの時間情報と、該当メディアセグメントが動画ならばその代表画像といったような代替データの集合によって記述する構造記述データから、構造記述データに記述されているメディアセグメントまたはその代替データの再生順序、再生のタイミング、同期情報を表現する表現記述データへの変換が行える。 As described above, according to the second embodiment, the configuration of the whole or part of the media content is described by the time information of the media segment and a set of alternative data such as a representative image if the media segment is a moving image. The structure description data can be converted into expression description data representing the reproduction order, reproduction timing, and synchronization information of the media segment described in the structure description data or its alternative data.

これにより、メディアコンテンツの構成に関する情報から、端末に合わせた表示メディアの再生に関する情報を生成することができる。この結果、メディアコンテンツの配信において、端末に合わせた配信データを作成することができる。 As a result, it is possible to generate information related to playback of display media tailored to the terminal from information related to the configuration of the media content. As a result, in the distribution of media content, distribution data suitable for the terminal can be created.

（実施の形態３） (Embodiment 3)
実施の形態３は、端末に合わせたメディアコンテンツの再生、配信を行うために、構造記述データにメディアセグメントとその代替データと、端末に合わせてメディアセグメントと代替データを切り替えるデータとを記述したものである。そして、この構造記述データを、メディアセグメントおよび代替データを端末に合わせて切り替えて表現する表現記述データへの変換を行うものである。 In the third embodiment, in order to reproduce and distribute media content tailored to the terminal, the structure description data describes media segments, their alternative data, and data for switching between the media segments and the alternative data according to the terminal. This structure description data is then converted into expression description data that expresses the media segments and the alternative data so that they are switched according to the terminal.

以下、本発明の実施の形態３について説明する。実施の形態３の表現記述データには、メディアセグメントを再生する場合と、代替データを再生する場合の二通りをひとつのSMIL文書に記述して出力するものである。構造記述データとしては、図１４および図１５に示したものを用いる。 The third embodiment of the present invention will be described below. The expression description data of the third embodiment describes and outputs two types of media segment reproduction and alternative data reproduction in one SMIL document. As the structure description data, the data shown in FIGS. 14 and 15 is used.

本実施の形態によって出力される表現記述データには、メディアセグメントを再生する場合と、代替データを再生する場合とが、共に記述されている。これを基にメディアコンテンツを再生するときは、メディアセグメントを再生する場合と、代替データを再生する場合とのどちらを再生するか選択する必要がある。そこで、表現記述データの中に、選択するための条件を記述するようにしている。 The expression description data output according to the present embodiment describes both a case where a media segment is reproduced and a case where alternative data is reproduced. When media content is reproduced based on this, it is necessary to select whether to reproduce a media segment or alternative data. Therefore, the conditions for selection are described in the expression description data.

選択するための条件は、SMILにおけるswitch要素で記述できるため、本実施の形態においても、表現記述データとしてSMIL文書を用いる。switch要素とは複数のメディアから条件にあったひとつを選択するものである。選択は、switch要素の内容に書かれたメディア順に評価され、最初に条件に合ったメディアが選択される。条件はswitch要素の内容に書かれたメディアの属性に付けられており、system-bitrate、system-captionなどがある。 Since the selection condition can be described by a switch element in SMIL, the SMIL document is used as the expression description data also in the present embodiment. The switch element is to select one that meets the conditions from multiple media. The selection is evaluated in the order of the media written in the contents of the switch element, and the media that meets the conditions is selected first. Conditions are attached to the media attributes written in the contents of the switch element, such as system-bitrate and system-caption.

本実施の形態においては、メディアコンテンツを配送するネットワークの接続ビットレートを条件とする。具体的には、接続ビットレートが５６ｋｂｐｓ以上の場合はメディアセグメントを再生し、５６ｋｂｐｓ未満の場合は代替データを再生することとする。 In the present embodiment, the connection bit rate of the network for delivering the media content is a condition. Specifically, when the connection bit rate is 56 kbps or higher, the media segment is played back. When the connection bit rate is lower than 56 kbps, alternative data is played back.

The following describes the process by which the description converter 1003 outputs a SMIL document that plays back either the media segments or the alternative data. Compared with the flowchart of FIG. 4 in the first embodiment, only part of the processing in steps S405 and S412, which output the SMIL document, differs. The processing corresponding to step S405 or step S412 is therefore described with reference to FIG. 19.

First, the description converter 1003 outputs the SMIL header (step S1901). Next, the description converter 1003 encloses the entire media in <switch> and </switch> (step S1902). The description converter 1003 then encloses the media segments in <seq> and </seq> (step S1903) and sets the system-bitrate attribute of the seq element to system-bitrate="56000" (step S1904).

The system-bitrate attribute is used for condition evaluation within a switch element and specifies the bandwidth available to the system in bits per second. An element inside the switch is judged to meet the condition if the available bandwidth is greater than or equal to the value written here. In the above example, the condition is met if the bit rate is 56000 bps or more. If this is the first match within the switch element, the media with the matched condition is selected.

次に、記述コンバータ１００３は、図５に示すＳ５０３〜Ｓ５０５もしくは図９に示すＳ９０３〜Ｓ９０８の処理を行う（ステップＳ１９０５）。これにより、メディアセグメントを再生するSMIL文書を出力する。 Next, the description converter 1003 performs the processing of S503 to S505 shown in FIG. 5 or S903 to S908 shown in FIG. 9 (step S1905). This outputs a SMIL document that reproduces the media segment.

この場合、代替データを表すalt要素を無視することにより、実施の形態１でのステップＳ４０５あるいはステップＳ４１２の処理手順を用いることができる。 In this case, the processing procedure of step S405 or step S412 in the first embodiment can be used by ignoring the alt element representing the substitute data.

Next, the description converter 1003 encloses the alternative data in <seq> and </seq> without setting the system-bitrate attribute of the seq element (step S1906), and performs the processing procedure of S1603 to S1612 in FIG. 16 or S1803 to S1810 in FIG. 18, both described in the second embodiment (step S1907). The description converter 1003 thereby outputs a SMIL document that plays back the alternative data.

このようにして、メディアセグメントを再生する場合と、代替データを再生する場合とのどちらを再生するか選択できるSMIL文書が作成できる。 In this way, it is possible to create a SMIL document in which it is possible to select whether to reproduce a media segment or alternative data.
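As a rough illustration of steps S1901 to S1907, the following Python sketch assembles a SMIL document of this shape. The function name, the clip strings, and the file names are hypothetical and are not taken from the patent figures; the clips are assumed to have been produced already by the per-segment procedures of the earlier embodiments.

```python
def build_switch_smil(segment_clips, alt_clips, bitrate=56000):
    """Wrap two clip lists in a <switch>: media segments first, alternative data second."""
    lines = ["<smil>", "  <body>"]                           # S1901: SMIL header
    lines.append("    <switch>")                             # S1902: open the switch
    lines.append(f'      <seq system-bitrate="{bitrate}">')  # S1903-S1904: conditional seq
    lines += [f"        {clip}" for clip in segment_clips]   # S1905: media segment clips
    lines.append("      </seq>")
    lines.append("      <seq>")                              # S1906: seq with no condition
    lines += [f"        {clip}" for clip in alt_clips]       # S1907: alternative data clips
    lines.append("      </seq>")
    lines.append("    </switch>")                            # closes the switch from S1902
    lines += ["  </body>", "</smil>"]
    return "\n".join(lines)


# Hypothetical clips; the file names are placeholders.
doc = build_switch_smil(
    ['<video src="movie.mpg" clip-begin="npt=0s" clip-end="npt=10s"/>'],
    ['<img src="keyframe0.jpg" dur="10s"/>'],
)
print(doc)
```

The conditional seq is written first because a switch element takes the first child whose condition holds; the unconditioned seq then acts as the fallback.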

FIG. 20 shows the SMIL document output according to the third embodiment. The SMIL document shown in FIG. 20 contains a switch element 2000, whose content consists of two seq elements 2001 and 2002. The seq element 2001 is the part from <seq system-bitrate="56000"> to the first </seq>, and the seq element 2002 is the part from the following <seq> to </seq>. The switch element first evaluates <seq system-bitrate="56000">. If the bit rate available to the system is 56000 bps or more, the condition is satisfied and the seq element 2001 is selected. If the available bit rate is less than 56000 bps, the seq element 2001 is not selected and the seq element 2002 is evaluated.

seq要素２００１はメディアセグメントを再生することを示す部分であり、seq要素２００２は代替データを再生することを示す部分である。よって、システムが利用可能なビットレートが56000bps以上であればメディアセグメントを再生し、56000bps未満であれば代替データが再生されることになる。 A seq element 2001 is a part indicating that the media segment is reproduced, and a seq element 2002 is a part indicating that the alternative data is reproduced. Therefore, if the bit rate that can be used by the system is 56000 bps or more, the media segment is reproduced, and if it is less than 56000 bps, the alternative data is reproduced.
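The evaluation order described for the switch element can be sketched as follows. This is a minimal model of the behavior described in the text, not a SMIL player; the function name and the id attributes are assumptions added only to make the two seq elements distinguishable.

```python
import xml.etree.ElementTree as ET


def select_switch_child(switch, available_bps):
    """Return the first child of a <switch> whose bit rate condition holds."""
    for child in switch:
        required = child.get("system-bitrate")
        # A child without system-bitrate carries no condition and always matches.
        if required is None or available_bps >= int(required):
            return child
    return None


# Two seq elements standing in for 2001 (media segments) and 2002 (alternative data).
switch = ET.fromstring(
    "<switch>"
    '<seq system-bitrate="56000" id="segments"/>'
    '<seq id="alternatives"/>'
    "</switch>"
)

print(select_switch_child(switch, 64000).get("id"))
print(select_switch_child(switch, 28800).get("id"))
```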

なお、本実施の形態においては、メディアセグメントあるいは代替データのどちらを再生するかの選択の条件として、ネットワークの接続ビットレートを用いたが、他の条件であってもよい。ただし、その場合、SMILのswitch要素を用いることができない条件もあるため、SMILのswitch要素を拡張した表現記述データを定義する必要がある。 In the present embodiment, the connection bit rate of the network is used as a condition for selecting whether to reproduce the media segment or the alternative data, but other conditions may be used. In this case, however, there are conditions in which the SMIL switch element cannot be used, so it is necessary to define expression description data that extends the SMIL switch element.

The alt element of the structure description data may also be extended, as shown in FIG. 21(a), to have a child element named condition that describes the condition under which the alternative data specified there is used, and the cases may then be distinguished according to the condition specified by the condition element.

FIG. 21(b) shows structure description data that uses the condition child element. The structure description data shown in FIG. 21(b) means that the expression description data is to be constructed so that the data on the line above is used in the narrow-band case.

As described above, according to the third embodiment, structure description data, which describes the configuration of all or part of the media content by the time information of the media segments and a set of alternative data (for example, a representative image when the media segment is a moving image), can be converted into expression description data that expresses the playback order, playback timing, and synchronization information of the media segments and their alternative data described in the structure description data, together with information indicating that either the media segment or the alternative data is to be selected for playback. This makes it possible to generate, from information on the configuration of the media content, playback information that includes the selection of media segments or alternative data suited to the terminal.

(Embodiment 4)
The fourth embodiment concerns continuous audiovisual information (media content) in which video information and audio information are synchronized, and aims to play back and distribute only representative parts of the media content, such as a synopsis or highlight scenes. Its inputs are structure description data and a threshold on importance based on context content. The structure description data expresses the configuration of the media content as a set of the segments (media segments) into which the content is divided, and describes the time information of each media segment and its importance based on the context content of that segment. Only the media segments whose importance is equal to or greater than the threshold are selected from the structure description data. The selected media segments are then converted into expression description data that expresses, as the playback form, the playback order and playback timing of the media segments, and this data is output.

In this way, by selecting only the media segments of high importance from the information on the configuration of the media content, only the media segments that make up a synopsis or highlight scenes are picked out, and only those media segments are converted into expression description data for playback.

The fourth embodiment of the present invention is described below. The fourth embodiment relates to a configuration in which no alternative data is specified for the media segments. FIG. 22 shows a block diagram of the data processing apparatus according to the fourth embodiment. In FIG. 22, reference numeral 1501 denotes a summary engine serving as the selection means, and 1502 denotes a description converter serving as the conversion means. Reference numeral 1503 denotes a content description, which is the input structure description data; 1504 denotes a selection condition; and 1505 denotes a playback method description, which is the output expression description data.

図２３に、実施の形態４で用いる構造記述データのDTDを示す。図２３に示すDTDは、図２（ａ）で示したDTDのsegment要素に、メディアセグメントの文脈内容に基づく重要度を表すscore２３０１という属性を加えたものである。この重要度は、正の整数値で表されるものとし、１が最も重要度が低いとする。 FIG. 23 shows the DTD of the structure description data used in the fourth embodiment. The DTD shown in FIG. 23 is obtained by adding an attribute “score 2301” indicating the importance based on the context contents of the media segment to the segment element of the DTD shown in FIG. This importance is represented by a positive integer value, and 1 is the lowest importance.

次に、図２４に、実施の形態４の構造記述データである内容記述１５０３の例を示す。 Next, FIG. 24 shows an example of the content description 1503 which is the structure description data of the fourth embodiment.

図中２４０１に示すように、各セグメントには、重要度を示すscore属性が付与されている。 As indicated by 2401 in the figure, each segment is given a score attribute indicating importance.

実施の形態４においては、選択条件１５０４として、メディアセグメントの重要度を用いる。そして、要約エンジン１５０１は、メディアセグメントの重要度があるしきい値以上であることを条件として、メディアセグメントの選択を行う。以下、選択手段である要約エンジン１５０１の処理について、図２５のフローチャートを用いて説明する。 In the fourth embodiment, the importance of the media segment is used as the selection condition 1504. The summary engine 1501 selects a media segment on the condition that the importance of the media segment is equal to or greater than a certain threshold value. Hereinafter, processing of the summary engine 1501 as selection means will be described with reference to the flowchart of FIG.

First, in step S2501, the summary engine 1501 extracts the first media segment described in the content description 1503, that is, the first segment element. Next, in step S2502, the summary engine 1501 extracts the score attribute of the segment element, which is the score of the extracted media segment, and checks whether it is equal to or greater than the threshold. If the score attribute of the media segment is equal to or greater than the threshold, the summary engine 1501 proceeds to step S2503; if it is less than the threshold, the summary engine 1501 proceeds to step S2504.

In step S2503, the summary engine 1501 outputs the values of the start and end attributes of the segment element, which are the start time and end time of the media segment, to the description converter 1502, which is the conversion means.

In step S2504, the summary engine 1501 checks whether any unprocessed media segment remains. If one remains, the summary engine 1501 proceeds to step S2505; if none remains, it ends the processing.

要約エンジン１５０１は、ステップＳ２５０５において、未処理のメディアセグメントのうち、先頭のsegment要素を取り出し、ステップＳ２５０２へ移行する。 In step S2505, the summary engine 1501 extracts the first segment element from the unprocessed media segments, and proceeds to step S2502.
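The loop of FIG. 25 can be sketched in Python as below. The segment element and its start, end, and score attributes come from the text; the contents root element, the src attribute, the time format, and the sample values are assumptions made only so that the example runs.

```python
import xml.etree.ElementTree as ET


def select_segments(content_description, threshold):
    """Return (start, end) pairs of the segments whose score is at or above the threshold."""
    root = ET.fromstring(content_description)
    selected = []
    for segment in root.iter("segment"):             # S2501 / S2505: next segment in order
        if int(segment.get("score")) >= threshold:   # S2502: compare score with threshold
            # S2503: hand start and end over to the description converter
            selected.append((segment.get("start"), segment.get("end")))
    return selected                                  # S2504: nothing left, so finish


# Hypothetical content description; values are placeholders.
SAMPLE = """
<contents>
  <mediaObject type="video" src="movie.mpg">
    <segment start="00:00:00" end="00:01:00" score="5"/>
    <segment start="00:01:00" end="00:02:30" score="2"/>
    <segment start="00:02:30" end="00:04:00" score="4"/>
  </mediaObject>
</contents>
"""

print(select_segments(SAMPLE, 4))
```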

The processing of the description converter 1502, which is the conversion means, is the same as the procedure for converting structure description data into SMIL shown in FIG. 4 of the first embodiment, so a detailed description is omitted.

In the fourth embodiment, the summary engine 1501 outputs the content of the elements of the selected media segments to the description converter 1502, and the description converter 1502 uses it for processing. Alternatively, the summary engine 1501 may create intermediate structure description data that retains only the selected media segments, and the description converter 1502 may take this intermediate structure description data as input and process it.

FIG. 26 shows an example of the intermediate structure description data generated from the content description 1503, which is the structure description data of FIG. 23, when the threshold is set to 4.

As can be seen from reference numeral 2601 in the figure, only the media segments whose importance is equal to or greater than the threshold of 4 are selected and described in the intermediate structure description data.
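One way to picture the intermediate structure description data is as the input document with the low-scoring segment elements removed. The following sketch does this with Python's standard XML library; the function name, the contents root element, and the sample values are hypothetical and do not reproduce the patent figures.

```python
import xml.etree.ElementTree as ET


def prune_low_score_segments(content_description, threshold):
    """Return the parsed document with segments scoring below the threshold removed."""
    root = ET.fromstring(content_description)
    for parent in list(root.iter()):
        # findall returns a list, so removing children while looping over it is safe.
        for segment in parent.findall("segment"):
            if int(segment.get("score")) < threshold:
                parent.remove(segment)
    return root


# Hypothetical content description; values are placeholders.
CONTENT = """
<contents>
  <mediaObject type="video" src="movie.mpg">
    <segment start="00:00:00" end="00:01:00" score="5"/>
    <segment start="00:01:00" end="00:02:30" score="2"/>
    <segment start="00:02:30" end="00:04:00" score="4"/>
  </mediaObject>
</contents>
"""

pruned = prune_low_score_segments(CONTENT, 4)
print(ET.tostring(pruned, encoding="unicode"))
```

Because the result keeps the same element structure as the input, a converter written for the original structure description data could read it unchanged.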

The selection condition above is that the importance of a media segment is equal to or greater than a certain threshold, but the condition may instead be that the total playback time of the selected media segments is equal to or less than a certain threshold. In this case, the summary engine 1501 sorts all media segments in descending order of importance and selects media segments from the head of the sorted list until the total playback time is as large as possible without exceeding the threshold. The importance condition and the playback time condition may also be combined.
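The playback time condition can be sketched as follows. This is one possible reading of the text: segments are taken from the head of the importance-sorted list and selection stops at the first segment that would exceed the limit. Representing times as seconds in dictionaries and restoring timeline order at the end are assumptions made for the example.

```python
def select_within_duration(segments, max_total_seconds):
    """Pick the most important segments whose total playback time fits within the limit."""
    # Sort by importance, highest first; sorted() is stable, so ties keep timeline order.
    ranked = sorted(segments, key=lambda s: s["score"], reverse=True)
    chosen, total = [], 0
    for seg in ranked:
        duration = seg["end"] - seg["start"]
        if total + duration > max_total_seconds:
            break  # the next segment in importance order no longer fits
        chosen.append(seg)
        total += duration
    # Assumption: put the chosen segments back in timeline order for playback.
    return sorted(chosen, key=lambda s: s["start"])


# Hypothetical segments with start and end in seconds.
segments = [
    {"start": 0, "end": 60, "score": 5},
    {"start": 60, "end": 150, "score": 2},
    {"start": 150, "end": 240, "score": 4},
    {"start": 240, "end": 270, "score": 3},
]

print([s["start"] for s in select_within_duration(segments, 160)])
```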

以上のように、実施の形態４によれば、メディアセグメントの文脈内容に基づいた重要度から、メディアセグメントの選択を行うことにより、あらすじやハイライトシーン集などを構成し、それらの表現記述データの生成が行える。これにより、ユーザが希望する部分だけのメディアコンテンツの再生、配信が行える。 As described above, according to the fourth embodiment, by selecting a media segment from the importance based on the context content of the media segment, a synopsis, a highlight scene collection, and the like are constructed, and their expression description data Can be generated. As a result, it is possible to reproduce and distribute the media content only for the portion desired by the user.

なお、セグメントの重要度に合わせて、セグメントの再生時間を変化させた要約内容記述を作成してもよい。 A summary content description in which the segment playback time is changed may be created in accordance with the importance of the segment.

(Embodiment 5)
Whereas the fourth embodiment is limited to media content in which the video information and the audio information form a single media object, the fifth embodiment also covers media content composed of multiple media objects that are synchronized with one another.

以下、本発明の実施の形態５について述べる。実施の形態５は、メディアセグメントの代替データが指定されていない構成に関するものである。実施の形態５におけるデータ処理装置のブロック図は図２２に示したものと同様である。 Embodiment 5 of the present invention will be described below. The fifth embodiment relates to a configuration in which alternative data for a media segment is not specified. The block diagram of the data processing apparatus in the fifth embodiment is the same as that shown in FIG.

In the fifth embodiment as well, the DTD shown in FIG. 23 is used for the structure description data 1503. FIG. 27 shows an example of the content description 1503, which is the structure description data in the fifth embodiment.

図２７に示す内容記述１５０３には、タイプがvideoのmediaObject要素２７０１と、タイプがaudioのmediaObject要素２７０２と、が記述されている。図中２７０３に示すようにタイプがvideoのmediaObject要素２７０１のセグメントには、重要度であるscore属性が記述されている。また、図中２７０４に示すようにタイプがaudioのmediaObject要素２７０２のセグメントには、重要度であるscore属性が記述されている。 In the content description 1503 shown in FIG. 27, a mediaObject element 2701 whose type is video and a mediaObject element 2702 whose type is audio are described. As shown by reference numeral 2703 in the figure, a score attribute that is an importance level is described in a segment of a mediaObject element 2701 of type video. In addition, as shown by 2704 in the figure, a score attribute which is an importance level is described in a segment of a mediaObject element 2702 whose type is audio.

In the fifth embodiment as well, the selection condition 1504 is that the importance of a media segment is equal to or greater than a certain threshold. In this case, the summary engine 1501, which is the selection means, performs the processing of the summary engine 1501 of the fourth embodiment for each mediaObject element.

図２８に実施の形態５における要約エンジン１５０１の処理のフローチャートを示す。 FIG. 28 shows a flowchart of processing of the summary engine 1501 in the fifth embodiment.

まず、要約エンジン１５０１は、ステップＳ２８０１において、最初のmediaObject要素を取り出す。次に、要約エンジン１５０１は、Ｓ２８０２において、取り出したmediaObject要素の内容であるメディアセグメントのうち先頭のものであるsegment要素を取り出す。そして、要約エンジン１５０１は、ステップＳ２８０３において、取り出したメディアセグメントのスコアを表すsegment要素のscore属性の値を取り出し、それがしきい値以上であるかどうかを調べる。 First, the summary engine 1501 extracts the first mediaObject element in step S2801. Next, in S2802, the summary engine 1501 extracts the segment element that is the head of the media segments that are the contents of the extracted mediaObject element. In step S2803, the summary engine 1501 extracts the score attribute value of the segment element indicating the score of the extracted media segment, and checks whether it is equal to or greater than the threshold value.

The summary engine 1501 proceeds to step S2804 if the score of the extracted media segment is equal to or greater than the threshold, and to step S2805 if it is less than the threshold. In step S2804, the summary engine 1501 outputs the values of the start and end attributes of the segment element, which are the start time and end time of the media segment, to the description converter 1502.

Next, in step S2805, the summary engine 1501 checks whether any unprocessed media segment remains. If one remains, the summary engine 1501 proceeds to step S2806; if none remains, it proceeds to step S2807.

そして、要約エンジン１５０１は、ステップＳ２８０６で、未処理のメディアセグメントのうち、先頭のsegment要素を取り出し、ステップＳ２８０３の処理に移行する。 In step S2806, the summary engine 1501 extracts the first segment element from unprocessed media segments, and proceeds to the process in step S2803.

On the other hand, in step S2807, the summary engine 1501 checks whether any unprocessed mediaObject element remains; if one remains, it proceeds to step S2808, and if none remains, it ends the processing. In step S2808, the summary engine 1501 extracts the first of the unprocessed mediaObject elements and proceeds to step S2802.
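The nested loop of FIG. 28 can be sketched in Python as below: an outer loop over mediaObject elements and an inner loop over the segments of each one. The mediaObject and segment elements and the type, start, end, and score attributes come from the text; the contents root element, the src attribute, and the sample values are assumptions.

```python
import xml.etree.ElementTree as ET


def select_segments_per_object(content_description, threshold):
    """Return, for each mediaObject, its type and the (start, end) pairs that pass."""
    root = ET.fromstring(content_description)
    result = []
    for media_object in root.iter("mediaObject"):        # S2801 / S2807-S2808
        kept = [
            (seg.get("start"), seg.get("end"))           # S2804: output start and end
            for seg in media_object.findall("segment")   # S2802 / S2805-S2806
            if int(seg.get("score")) >= threshold        # S2803: threshold check
        ]
        result.append((media_object.get("type"), kept))
    return result


# Hypothetical content description with separate video and audio objects.
SAMPLE_AV = """
<contents>
  <mediaObject type="video" src="movie.mpv">
    <segment start="00:00:00" end="00:01:00" score="5"/>
    <segment start="00:01:00" end="00:02:00" score="3"/>
  </mediaObject>
  <mediaObject type="audio" src="movie.mp2">
    <segment start="00:00:00" end="00:01:00" score="2"/>
    <segment start="00:01:00" end="00:02:00" score="4"/>
  </mediaObject>
</contents>
"""

print(select_segments_per_object(SAMPLE_AV, 4))
```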

実施の形態５の変換手段である記述コンバータ１５０２も、各mediaObject要素毎の処理となるが、実施の形態１に示した図４の構造記述データからＳＭＩＬへの変換の手順の処理と同様の処理を行う。 The description converter 1502 which is the conversion means of the fifth embodiment also performs processing for each mediaObject element, but the same processing as the processing of the procedure for converting the structure description data to SMIL of FIG. 4 shown in the first embodiment. I do.

In the fifth embodiment, the summary engine 1501 outputs the content of the elements of the selected media segments to the description converter 1502, and the description converter 1502 uses it for processing. Alternatively, the summary engine 1501 may create intermediate structure description data that retains only the selected media segments, and the description converter 1502 may take this intermediate structure description data as input and process it.

FIG. 29 shows an example of the intermediate structure description data generated from the content description 1503 of FIG. 27 when the threshold is set to 4.

As can be seen from reference numeral 2901 in the figure, only the media segments whose importance is equal to or greater than the threshold of 4 are selected and described in the mediaObject element of type video. Likewise, as can be seen from 2902 in the figure, only the media segments whose importance is equal to or greater than the threshold of 4 are selected and described in the mediaObject element of type audio.

また、SMIL文書においてpar要素の中の各クリップに関し、同期させるために再生開始時間を異なるものとする必要が出てくる場合がある。この場合は、各クリップの再生開始時間を計算し、その時間がくれば再生を始めるようにする必要がある。 In addition, it may be necessary to make the playback start times different in order to synchronize each clip in the par element in the SMIL document. In this case, it is necessary to calculate the playback start time of each clip and start playback when that time arrives.

以上のように、実施の形態５によれば、メディアセグメントの文脈内容に基づいた重要度から、メディアセグメントの選択を行うことにより、あらすじやハイライトシーン集などを構成し、それらの表現記述データの生成が行える。これにより、ユーザが希望する部分だけのメディアコンテンツの再生、配信が行える。 As described above, according to the fifth embodiment, by selecting a media segment based on the importance based on the context content of the media segment, a synopsis, a highlight scene collection, and the like are constructed, and their expression description data Can be generated. As a result, it is possible to reproduce and distribute the media content only for the portion desired by the user.

(Embodiment 6)
The sixth embodiment of the present invention is described below. Whereas no alternative data is specified for the media segments in the fourth embodiment, alternative data is specified for the media segments in the sixth embodiment. The sixth embodiment relates to a configuration in which the summary engine does not choose between playing back the media segment and playing back the alternative data.

実施の形態６におけるデータ処理装置のブロック図は図２２に示したものと同様である。 The block diagram of the data processing apparatus in the sixth embodiment is the same as that shown in FIG.

実施の形態６において用いる構造記述データのDTDの例を図３０に示す。図中３００１に示すように、図３０に示すDTDは、図１３で示したDTDのsegment要素にメディアセグメントの文脈内容に基づく重要度を表すscoreという属性を加えたものである。この重要度は、正の整数値で表されるものとし、１が最も重要度が低いとする。 An example of the DTD of the structure description data used in the sixth embodiment is shown in FIG. As indicated by reference numeral 3001 in the figure, the DTD shown in FIG. 30 is obtained by adding the attribute “score” representing the importance based on the context contents of the media segment to the segment element of the DTD shown in FIG. This importance is represented by a positive integer value, and 1 is the lowest importance.

図３１に、構造記述データである内容記述１５０３の例を示す。図３１からわかるように、代替データで構成されるセグメントには、それぞれ重要度を示すscore属性が記述されている。 FIG. 31 shows an example of a content description 1503 that is structure description data. As can be seen from FIG. 31, a score attribute indicating importance is described in each segment composed of alternative data.

実施の形態６における選択手段である要約エンジン１５０１の処理は、実施の形態４における要約エンジンの処理と同様である。だたし、実施の形態６における選択手段である要約エンジン１５０１は、選択したメディアセグメントの出力する際に、segment要素のstart属性，end属性に加えて、子要素であるalt要素も出力する。 The processing of summary engine 1501, which is the selection means in the sixth embodiment, is the same as the processing of summary engine in the fourth embodiment. However, the summary engine 1501, which is the selection means in the sixth embodiment, outputs the alt element as a child element in addition to the start attribute and end attribute of the segment element when outputting the selected media segment.
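The difference from the fourth embodiment, passing the alt child elements along with the start and end attributes, can be sketched as follows. The alt element name comes from the text; its type and src attributes, the contents root element, and the sample values are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def select_segments_with_alt(content_description, threshold):
    """Return start, end, and alt children for each segment at or above the threshold."""
    root = ET.fromstring(content_description)
    selected = []
    for segment in root.iter("segment"):
        if int(segment.get("score")) >= threshold:
            selected.append({
                "start": segment.get("start"),
                "end": segment.get("end"),
                # The alt child elements are passed along unchanged; no choice between
                # the media segment and its alternative data is made at this stage.
                "alt": [dict(alt.attrib) for alt in segment.findall("alt")],
            })
    return selected


# Hypothetical content description; attribute names on alt are placeholders.
SAMPLE_WITH_ALT = """
<contents>
  <mediaObject type="video" src="movie.mpg">
    <segment start="00:00:00" end="00:01:00" score="5">
      <alt type="image" src="key0.jpg"/>
    </segment>
    <segment start="00:01:00" end="00:02:30" score="2">
      <alt type="image" src="key1.jpg"/>
    </segment>
  </mediaObject>
</contents>
"""

picked = select_segments_with_alt(SAMPLE_WITH_ALT, 4)
print(picked)
```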

The processing of the description converter 1502, which is the conversion means in the sixth embodiment, is the same as the procedure for converting structure description data into SMIL shown in FIG. 4 and described in the first, second, and third embodiments.

In the present embodiment, the summary engine 1501 outputs the content of the elements of the selected media segments to the description converter 1502, and the description converter 1502 uses it for processing. Alternatively, the summary engine 1501 may create intermediate structure description data that retains only the selected media segments, and the description converter 1502 may take this intermediate structure description data as input and process it.

FIG. 32 shows an example of the intermediate structure description data generated from the content description 1503, which is the structure description data of FIG. 31, when the threshold is set to 4.

図３２に記された構造記述データには、重要度であるscore属性の値が４以上のセグメントと、このセグメントの代替データのみを選択して記述してある。 In the structure description data shown in FIG. 32, only a segment having an importance score attribute value of 4 or more and alternative data of this segment are selected and described.

(Embodiment 7)
The seventh embodiment of the present invention is described below. Whereas no alternative data is specified for the media segments in the fifth embodiment, alternative data is specified for the media segments in the seventh embodiment. The seventh embodiment relates to a configuration in which alternative data is specified for the media segments and the summary engine does not choose between playing back the media segment and playing back the alternative data.

実施の形態７におけるデータ処理装置のブロック図は図２２に示したものと同様である。 The block diagram of the data processing device in the seventh embodiment is the same as that shown in FIG.

実施の形態７においても、構造記述データである内容記述１５０３のためのDTDとして、図３０に示したものを用いる。図３３に実施の形態７における構造記述データである内容記述１５０３の例を示す。 Also in the seventh embodiment, the DTD shown in FIG. 30 is used as the DTD for the content description 1503 that is the structure description data. FIG. 33 shows an example of a content description 1503 which is structure description data in the seventh embodiment.

実施の形態７における選択手段である要約エンジン１５０１の処理は、実施の形態５における要約エンジン１５０１の処理と同様である。だたし、実施の形態７にかかる要約エンジン１５０１は、選択したメディアセグメントを出力する際に、segment要素のstart属性，end属性に加えて、子要素であるalt要素も出力する。 The processing of summary engine 1501, which is the selection means in the seventh embodiment, is the same as the processing of summary engine 1501 in the fifth embodiment. However, when outputting the selected media segment, the summary engine 1501 according to the seventh embodiment outputs an alt element as a child element in addition to the start attribute and end attribute of the segment element.

また、実施の形態７における記述コンバータ１５０２の処理は、実施の形態１、実施の形態２、または実施の形態３に示した図４の構造記述データからSMILへの変換の手順の処理と同様である。 The processing of the description converter 1502 in the seventh embodiment is the same as the processing of the procedure for converting the structure description data into SMIL in FIG. 4 shown in the first, second, or third embodiment. is there.

The structure description data shown in FIG. 34 is an example of the intermediate structure description data generated from the content description 1503 of FIG. 33 when the threshold is set to 4.

図３４に記された構造記述データには、重要度であるscore属性の値が４以上のセグメントとこのセグメントの代替データを、メディアのタイプ毎に分けて記述してある。 In the structure description data shown in FIG. 34, a segment having an importance score attribute value of 4 or more and alternative data of this segment are described separately for each type of media.

(Embodiment 8)
The eighth embodiment is intended to play back and distribute only representative parts of continuous audiovisual information (media content) in which video information and audio information are synchronized, such as a synopsis or highlight scenes, using display media suited to the terminal. Its inputs are structure description data and a threshold on importance based on context content. The structure description data expresses the configuration of the media content as a set of the segments (media segments) into which the content is divided, and describes the time information of each media segment and its importance based on the context content of that segment. Only the media segments whose importance is equal to or greater than the threshold are selected. Then, as the playback form of each selected media segment, either the media segment or its alternative data is selected, and the result is converted into expression description data that expresses the playback order and playback timing and is output.

これにより、メディアコンテンツの構成に関する情報から、重要度の高いメディアセグメントのみを選択することによって、あらすじやハイライトシーンを構成するメディアセグメントだけを選び出し、選び出したメディアセグメントのみの再生に関する表現記述データへの変換を行うことができる。よって、メディアコンテンツを再生する端末の能力や配信するネットワークの状況に合わせたメディアの選択を実現することができる。 As a result, by selecting only the media segments of high importance from the information on the configuration of the media content, only the media segments that make up a synopsis or highlight scenes are picked out, and the description can be converted into expression description data for playing back only those segments. This makes it possible to select media in accordance with the capability of the terminal that plays back the media content and the condition of the network over which the content is distributed.

本発明に実施の形態８について説明する。実施の形態６がメディアセグメントの代替データが指定されており、メディアセグメントを再生するか代替データを再生するかの選択を行わないものに対し、実施の形態８は、メディアセグメントの代替データが指定されており、メディアセグメントを再生するか代替データを再生するかの選択を行うものである。実施の形態８においては、選択手段が、メディアセグメント選択手段と、再生メディア選択手段に分かれる。また、選択条件はセグメント選択条件と再生メディア選択条件に分かれる。 Embodiment 8 of the present invention will now be described. In the sixth embodiment, alternative data for the media segments is specified, but no selection is made between playing back a media segment and playing back its alternative data. In the eighth embodiment, alternative data for the media segments is likewise specified, and a selection is made between playing back a media segment and playing back its alternative data. In the eighth embodiment, the selection means is divided into media segment selection means and playback media selection means, and the selection condition is divided into a segment selection condition and a playback media selection condition.

図３５に実施の形態８におけるデータ処理装置のブロック図を示す。図３５において、２８０１で記されるものはメディアセグメント選択手段である要約エンジンである。また、２８００で記されるものは記述コンバータである。また、記述コンバータ２８００は、再生メディア選択手段である再生メディア選択部２８０２と、変換手段である変換部２８０３とから構成されている。 FIG. 35 shows a block diagram of a data processing apparatus according to the eighth embodiment. In FIG. 35, reference numeral 2801 denotes a summary engine which is a media segment selection means. Also, what is indicated by 2800 is a description converter. The description converter 2800 includes a reproduction media selection unit 2802 as reproduction media selection means and a conversion unit 2803 as conversion means.

２８０４で記されるものは入力データであり構造記述データである内容記述を、２８０５で記されるものはセグメント選択条件を、２８０６で記されるものは再生メディア選択条件を、２８０７で記されるものは出力であり表現記述データある再生方法記述を表す。 Reference numeral 2804 denotes the content description, which is the input structure description data; 2805 denotes the segment selection condition; 2806 denotes the playback media selection condition; and 2807 denotes the playback method description, which is the output expression description data.

実施の形態８において、構造記述データである内容記述２８０４は実施の形態６における内容記述１５０３と同じものを用いる。すなわち、内容記述２８０４は、図３０に示したＤＴＤを用いたもので、一例は図３１に示されているものである。また、セグメント選択条件２８０５は、実施の形態４あるいは実施の形態６における選択条件１５０４と同様のものを用いる。この場合、メディアセグメント選択手段である要約エンジン２８０１の処理は、実施の形態６における要約エンジン１５０１の処理と同様となる。 In the eighth embodiment, the content description 2804 which is the structure description data is the same as the content description 1503 in the sixth embodiment. That is, the content description 2804 uses the DTD shown in FIG. 30, and an example is shown in FIG. The segment selection condition 2805 is the same as the selection condition 1504 in the fourth embodiment or the sixth embodiment. In this case, the process of the summary engine 2801 as the media segment selection means is the same as the process of the summary engine 1501 in the sixth embodiment.

次に、再生メディア選択部２８０２の処理について説明する。再生メディア選択部２８０２は、再生メディア選択条件２８０６として、メディアコンテンツを配送するネットワークの接続ビットレートを条件とする。すなわち、再生メディア選択部２８０２は、接続ビットレートが５６ｋｂｐｓ以上の場合はメディアセグメントを再生し、５６ｋｂｐｓ未満の場合は代替データを再生することとする。再生メディア選択部２８０２は、接続ビットレートを調べ、どちらを再生するかを判定し、変換部２８０３へ通知する。 Next, processing of the reproduction media selection unit 2802 will be described. The playback media selection unit 2802 uses the connection bit rate of the network that delivers the media content as the playback media selection condition 2806. That is, the playback media selection unit 2802 plays the media segment when the connection bit rate is 56 kbps or higher, and plays back alternative data when the connection bit rate is lower than 56 kbps. The reproduction media selection unit 2802 checks the connection bit rate, determines which one to reproduce, and notifies the conversion unit 2803.
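The decision made by the playback media selection unit 2802 can be illustrated with a minimal Python sketch. The 56 kbps boundary is the example value given in the text; the names `PlaybackForm`, `BITRATE_THRESHOLD_BPS` and `select_playback_form` are hypothetical and are not taken from the specification.

```python
from enum import Enum


class PlaybackForm(Enum):
    """What the conversion unit should play back (hypothetical names)."""
    MEDIA_SEGMENT = "media_segment"
    ALTERNATIVE_DATA = "alternative_data"


# 56 kbps is the example boundary given in the description.
BITRATE_THRESHOLD_BPS = 56_000


def select_playback_form(connection_bitrate_bps: int) -> PlaybackForm:
    """Return the media segment at 56 kbps or more, alternative data below that."""
    if connection_bitrate_bps >= BITRATE_THRESHOLD_BPS:
        return PlaybackForm.MEDIA_SEGMENT
    return PlaybackForm.ALTERNATIVE_DATA
```

Other playback media selection conditions, such as terminal capability or a user request, would only change the predicate inside the function; the result passed to the conversion unit 2803 could keep the same two-valued form.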

変換部２８０３は、メディアセグメント選択手段である要約エンジン２８０１が選択したメディアセグメントの要素と、再生メディア選択部２８０２が選択した結果を入力し、再生メディア選択部２８０２の結果に基づいて、SMILによる表現記述データである再生方法記述２８０７を出力する。 The conversion unit 2803 inputs the element of the media segment selected by the summary engine 2801 as the media segment selection means and the result selected by the playback media selection unit 2802, and based on the result of the playback media selection unit 2802, the expression by SMIL A reproduction method description 2807 which is description data is output.

変換部２８０３が内容記述２８０４をSMILに変換する処理は、実施の形態１あるいは実施の形態に示した図４の構造記述データからSMILへの変換の手順の処理と同様である。 The process of converting the content description 2804 into SMIL by the conversion unit 2803 is the same as the process of converting the structure description data to SMIL in FIG. 4 shown in the first embodiment or the embodiment.

なお、本実施の形態においては、要約エンジン２８０１が、選択したメディアセグメントの要素の内容を変換部２８０３へ出力し、変換部２８０３はそれを用いて処理を行う構成であるが、要約エンジン２８０１が選択されたメディアセグメントだけを残した中間的な構造記述データを作成し、変換部２８０３はこの中間的な構造記述データを入力して処理を行うものであってもよい。 In this embodiment, the summary engine 2801 outputs the content of the selected media segment element to the conversion unit 2803, and the conversion unit 2803 performs processing using the content. The intermediate structure description data that leaves only the selected media segment may be created, and the conversion unit 2803 may input the intermediate structure description data and perform processing.

また、再生メディア選択条件２８０６として、ネットワークのビットレートを用いたが、他に、再生端末の能力や、ユーザからの要求などであってもよい。 Further, although the network bit rate is used as the playback media selection condition 2806, it may be the capability of the playback terminal, a request from the user, or the like.

（実施の形態９） (Embodiment 9)
本発明に実施の形態９について述べる。実施の形態７がメディアセグメントの代替データが指定されており、メディアセグメントを再生するか代替データを再生するかの選択を行わないものに対し、実施の形態９は、メディアセグメントの代替データが指定されており、メディアセグメントを再生するか代替データを再生するかの選択を行うものである。また、実施の形態９は、選択手段においてメディアセグメントを再生するか代替データを再生するかの選択を行う構成に関するものである。 Embodiment 9 of the present invention will now be described. In the seventh embodiment, alternative data for the media segments is specified, but no selection is made between playing back a media segment and playing back its alternative data. In the ninth embodiment, alternative data for the media segments is likewise specified, and a selection is made between playing back a media segment and playing back its alternative data. The ninth embodiment thus relates to a configuration in which the selection means selects whether to play back a media segment or its alternative data.

実施の形態９においても、実施の形態８と同様、選択手段が、メディアセグメント選択手段と、再生メディア選択手段に分かれる。また、選択条件はセグメント選択条件と再生メディア選択条件に分かれる。したがって、本実施の形態におけるデータ処理装置のブロック図は、図３５に示したものと同様となる。 Also in the ninth embodiment, as in the eighth embodiment, the selection means is divided into a media segment selection means and a reproduction media selection means. The selection conditions are divided into segment selection conditions and playback media selection conditions. Therefore, the block diagram of the data processing apparatus in the present embodiment is the same as that shown in FIG.

実施の形態９において、構造記述データである内容記述２８０４は実施の形態７における構造記述データ１５０３と同じものを用いる。すなわち、内容記述２８０４は図３０に示したDTDを用いたもので、内容記述２８０４の一例は図３４に示されているものである。また、セグメント選択条件２８０５は、実施の形態８と同様のものを用いる。したがって、要約エンジン２８０１の処理は、実施の形態７における要約エンジン１５０１の処理と同様となる。 In the ninth embodiment, the content description 2804 which is the structure description data is the same as the structure description data 1503 in the seventh embodiment. That is, the content description 2804 uses the DTD shown in FIG. 30, and an example of the content description 2804 is shown in FIG. The segment selection condition 2805 is the same as that in the eighth embodiment. Therefore, the process of summary engine 2801 is the same as the process of summary engine 1501 in the seventh embodiment.

実施の形態９にかかる再生メディア選択部２８０２の処理は、実施の形態８に記述したものと同様のものを用いる。 The processing of the playback media selection unit 2802 according to the ninth embodiment uses the same processing as that described in the eighth embodiment.

変換部２８０３は、要約エンジン２８０１が選択したメディアセグメントの要素と、再生メディア選択部２８０２が選択した結果を入力し、再生メディア選択部２８０２の結果に基づいて、SMILによる表現記述データである再生方法記述２８０７を出力する。変換部２８０３が行う構造記述データからSMILへの変換処理は、実施の形態１あるいは実施の形態２に示した図４の構造記述データからSMILへの変換の手順と同様である。 The conversion unit 2803 inputs the element of the media segment selected by the summary engine 2801 and the result selected by the reproduction media selection unit 2802, and based on the result of the reproduction media selection unit 2802, a reproduction method that is expression description data by SMIL A description 2807 is output. The conversion process from the structure description data to SMIL performed by the conversion unit 2803 is the same as the conversion procedure from the structure description data to SMIL in FIG. 4 shown in the first or second embodiment.

（実施の形態１０） (Embodiment 10)
実施の形態１０は、映像情報と音声情報とが同期した連続視聴覚情報（メディアコンテンツ）に関して、ユーザの嗜好に合わせたメディアコンテンツの代表的な部分のみの再生、配信を行うためのものである。つまり、実施の形態１０は、メディアコンテンツに対して、メディアコンテンツを区分けした各区分（メディアセグメント）の集合によって該当メディアコンテンツの構成を表現し、かつ各メディアセグメントの時間情報と、キーワードで表現される観点に基づいた該当メディアセグメントの重要度とを記述したものである構造記述データと、ユーザの嗜好に合った観点とその重要度のしきい値とを入力とし、しきい値以上のメディアセグメントだけを選択するものである。そして、選択したメディアセグメントの再生形態としてメディアセグメントの再生順序、再生のタイミングを表現する表現記述データに変換し出力するものである。これにより、メディアコンテンツの構成に関する情報から、該当観点に基づく重要度がしきい値以上のメディアセグメントだけを選び出し、選び出したメディアセグメントのみの再生に関する表現記述データへの変換を行うものである。この結果、観点に基づく重要度により、ユーザの嗜好に合わせたハイライトシーン集などを構成でき、その部分だけの再生、配送を行うことができるといった作用を有する。 The tenth embodiment plays back and distributes only a representative portion of media content that matches the user's preference, for continuous audiovisual information (media content) in which video information and audio information are synchronized. Specifically, it takes two inputs. The first is structure description data, which expresses the configuration of the media content as a set of the segments (media segments) into which the content is divided, and which describes the time information of each media segment and the importance of that segment based on a viewpoint expressed by a keyword. The second is a viewpoint that matches the user's preference, together with a threshold value for its importance. Only the media segments whose importance is equal to or higher than the threshold value are selected. The result is then converted into expression description data expressing the playback order and playback timing of the selected media segments, and output. In this way, only the media segments whose importance based on the given viewpoint is equal to or higher than the threshold value are picked out from the information on the configuration of the media content, and the description is converted into expression description data for playing back only those segments. As a result, the importance based on a viewpoint makes it possible to compose, for example, a collection of highlight scenes that matches the user's preference, and to play back and deliver only that portion.

以下、発明の実施の形態１０について説明する。実施の形態１０は、メディアセグメントの代替データが指定されていない構成に関するものである。実施の形態１０におけるデータ処理装置のブロック図は図２２に示したものと同様となる。 Embodiment 10 of the invention will be described below. The tenth embodiment relates to a configuration in which alternative data for a media segment is not specified. The block diagram of the data processing apparatus according to the tenth embodiment is the same as that shown in FIG.

図３６に、実施の形態１０で用いる構造記述データのDTDを示す。図中３６０１に示すように、図３６に示すDTDは、図２（ａ）で示したDTDのsegment要素に、キーワードで表現される観点に基づく重要度を表すスコアを表すため、pointOfViewという要素を子要素となるように加えたものである。 FIG. 36 shows the DTD of the structure description data used in the tenth embodiment. As indicated by 3601 in the figure, the DTD shown in FIG. 36 adds an element called pointOfView as a child element of the segment element of the DTD shown in FIG. 2(a), in order to express a score representing importance based on a viewpoint expressed by a keyword.

また、図中３６０２に示すように、pointOfView要素は、viewPoint属性によって観点を表し、score属性によってviewPoint属性で示した観点に基づく重要度を表す。この重要度は、正の整数値で表されるものとし、１が最も重要度が低いとする。また、ひとつのsegment要素に複数のpointOfView要素をつけることができる。図３７に、実施の形態１０で用いる構造記述データである内容記述１５０３の例を示す。 Further, as indicated by 3602 in the figure, the pointOfView element represents a viewpoint by the viewPoint attribute, and represents the importance based on the viewpoint indicated by the viewPoint attribute by the score attribute. This importance is represented by a positive integer value, and 1 is the lowest importance. Also, multiple pointOfView elements can be attached to one segment element. FIG. 37 shows an example of a content description 1503 that is structure description data used in the tenth embodiment.

図３７からわかるように、segment要素毎に、pointOfView要素と、その属性であるviewPointと、属性であるscoreが記述されている。 As can be seen from FIG. 37, for each segment element, a pointOfView element, an attribute viewPoint, and an attribute score are described.
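As an illustration of the structure described above, the following Python sketch parses a small hand-written XML fragment in which each segment element carries pointOfView child elements. Only the names segment, start, end, pointOfView, viewPoint and score come from the description; the root element name `contents`, the time format, the keyword values and the function `read_segments` are assumptions made for the example, and the fragment is not the content of FIG. 37.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: root name, time format and keywords are assumed.
SAMPLE = """
<contents>
  <segment start="00:00:00" end="00:01:00">
    <pointOfView viewPoint="TeamA" score="5"/>
    <pointOfView viewPoint="TeamB" score="2"/>
  </segment>
  <segment start="00:01:00" end="00:02:30">
    <pointOfView viewPoint="TeamB" score="4"/>
  </segment>
</contents>
"""


def read_segments(xml_text):
    """Return one dict per segment with its time range and per-viewpoint scores."""
    root = ET.fromstring(xml_text)
    segments = []
    for seg in root.iter("segment"):
        # A segment may carry several pointOfView children, one per viewpoint.
        scores = {
            pov.get("viewPoint"): int(pov.get("score"))
            for pov in seg.findall("pointOfView")
        }
        segments.append(
            {"start": seg.get("start"), "end": seg.get("end"), "scores": scores}
        )
    return segments
```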

実施の形態１０においては、選択条件１５０４は、ある観点に関して、メディアセグメントのその観点に基づく重要度がしきい値以上であるとする。また、選択条件１５０４となる観点は複数であっても構わない。この場合の選択手段である要約エンジン１５０１における処理のフローチャートを図３８に示す。 In the tenth embodiment, the selection condition 1504 assumes that, for a certain viewpoint, the importance of the media segment based on that viewpoint is equal to or greater than a threshold value. Further, there may be a plurality of viewpoints that serve as the selection condition 1504. FIG. 38 shows a flowchart of processing in the summary engine 1501 as selection means in this case.

まず、要約エンジン１５０１は、ステップＳ３８０１において、先頭のメディアセグメントであるsegment要素を取り出す。次に、要約エンジン１５０１は、ステップＳ３８０２において、取り出したメディアセグメントであるsegment要素の内容であるpointOfView要素をすべて調べる。そして、要約エンジン１５０１は、調べたpointOfView要素のviewPoint属性に、選択条件１５０４で指定された観点が指定されているものがあるかどうかを調べる。 First, in step S3801, the summary engine 1501 extracts a segment element that is the head media segment. Next, in step S3802, the summary engine 1501 examines all pointOfView elements that are the contents of the segment element that is the extracted media segment. Then, the summary engine 1501 checks whether there is a viewpoint attribute specified in the selection condition 1504 in the viewPoint attribute of the checked pointOfView element.

そして、要約エンジン１５０１は、選択条件１５０４で指定された観点が指定されているものがある場合は、選択条件１５０４で指定された観点に基づく重要度としきい値を比べるために、ステップＳ３８０３の処理に移行する。一方、要約エンジン１５０１は、選択条件１５０４で指定された観点が指定されているものがない場合は、選択条件１５０４で指定された観点に基づく重要度がつけられていないため、ステップＳ３８０５の処理に移行する。 If any pointOfView element specifies the viewpoint given in the selection condition 1504, the summary engine 1501 proceeds to step S3803 to compare the importance based on that viewpoint with the threshold value. If none does, no importance based on the viewpoint given in the selection condition 1504 has been assigned to the segment, so the summary engine 1501 proceeds to step S3805.

次に、要約エンジン１５０１は、ステップＳ３８０３において、選択条件１５０４で指定された観点に基づく重要度がしきい値以上であるかを調べる。そして、選択条件１５０４は指定された観点に基づく重要度がしきい値以上の場合は、ステップＳ３８０４の処理に移行し、この重要度がしきい値未満の場合はステップＳ３８０５の処理に移行する。 Next, in step S3803, the summary engine 1501 checks whether the importance based on the viewpoint specified by the selection condition 1504 is equal to or greater than the threshold value. If it is, the process proceeds to step S3804; if the importance is less than the threshold value, the process proceeds to step S3805.

そして、要約エンジン１５０１は、ステップＳ３８０４において、該当メディアセグメントの開始時間と終了時間を表す、segment要素のstart属性とend属性の値を記述コンバータ１５０２に出力する。また、要約エンジン１５０１は、ステップＳ３８０５において、未処理のメディアセグメントがあるかどうかを調べ、ある場合はステップＳ３８０６の処理に移行する。一方、要約エンジン１５０１は、未処理のメディアセグメントがない場合は、処理を終了する。 In step S3804, the summary engine 1501 outputs the start attribute and end attribute values of the segment element indicating the start time and end time of the corresponding media segment to the description converter 1502. In step S3805, the summary engine 1501 checks whether there is an unprocessed media segment. If there is, the summary engine 1501 proceeds to the processing in step S3806. On the other hand, if there is no unprocessed media segment, the summary engine 1501 ends the process.

また、要約エンジン１５０１は、ステップＳ３８０６では、未処理のメディアセグメントのうち、先頭のsegment要素を取り出し、ステップＳ３８０２の処理に移行する。 In step S3806, the summary engine 1501 extracts the first segment element from unprocessed media segments, and proceeds to the process in step S3802.
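The flow of steps S3801 to S3806 can be sketched in Python as follows. This is a minimal sketch under stated assumptions: a single viewpoint is specified, times are plain numbers, and the names `Segment` and `select_segments` are hypothetical. The step numbers in the comments map each statement back to the flowchart described above.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """One media segment: time range plus importance per viewpoint keyword."""
    start: float
    end: float
    scores: dict = field(default_factory=dict)


def select_segments(segments, viewpoint, threshold):
    """Return (start, end) of every segment whose importance for `viewpoint`
    is equal to or greater than `threshold`, in document order."""
    selected = []
    for seg in segments:                    # S3801 / S3806: take the next segment
        score = seg.scores.get(viewpoint)   # S3802: look for the viewpoint
        if score is None:
            continue                        # not assigned: go to S3805
        if score >= threshold:              # S3803: compare with the threshold
            selected.append((seg.start, seg.end))  # S3804: output start and end
    return selected                         # S3805: no unprocessed segment remains
```

For example, with segments scored 5, 2 and 4 for a viewpoint and a threshold of 4, only the first and third time ranges would be passed on to the description converter 1502.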

記述コンバータ１５０２の処理は、実施の形態１に示した図４の構造記述データからSMILへの変換の手順と同様である。 The processing of the description converter 1502 is the same as the conversion procedure from the structure description data to SMIL in FIG. 4 shown in the first embodiment.

なお、実施の形態１０においては要約エンジン１５０１が、選択したメディアセグメントの要素の内容を記述コンバータ１５０２へ出力し、記述コンバータ１５０２はそれを用いて処理を行う構成であるが、要約エンジン１５０１が選択されたメディアセグメントだけを残した中間的な構造記述データを作成し、記述コンバータ１５０２はこの中間的な構造記述データを入力して処理を行う形態であってもよい。 In the tenth embodiment, the summary engine 1501 outputs the content of the selected media segment element to the description converter 1502, and the description converter 1502 performs processing using the content, but the summary engine 1501 selects the content. It is also possible to create intermediate structure description data that leaves only the media segments that have been processed, and the description converter 1502 inputs the intermediate structure description data and performs processing.

また、選択条件は、メディアセグメントのある観点に関する重要度があるしきい値以上であることとしたが、選択したメディアセグメントの再生時間の総和があるしきい値以下であることでもよい。この場合、要約エンジン１５０１は、すべてのメディアセグメントを指定された観点に関する重要度の高い順にソートし、再生時間の総和がしきい値以下で、かつ最大となるまで、ソートした先頭からメディアセグメントを選択していく処理を行うこととなる。 In addition, the selection condition is that the importance regarding a certain viewpoint of the media segment is equal to or higher than a certain threshold value, but the total sum of reproduction times of the selected media segments may be equal to or less than a certain threshold value. In this case, the summarization engine 1501 sorts all media segments in descending order of importance with respect to the designated viewpoint, and selects media segments from the sorted head until the total playback time is equal to or less than the threshold value and reaches the maximum. Processing to select is performed.
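The playback-time variant can be sketched in the same way. The sketch below sorts the segments by importance in descending order and takes them from the head of the sorted list while the total duration stays within the limit. Several points are assumptions rather than statements of the specification: segments without the specified viewpoint are skipped, selection stops at the first segment that would exceed the limit, and the chosen segments are returned in temporal order. The names `Segment` and `select_within_duration` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    start: float
    end: float
    scores: dict = field(default_factory=dict)


def select_within_duration(segments, viewpoint, max_total):
    """Pick the most important segments whose total duration is <= max_total."""
    # Assumption: segments with no score for the viewpoint are not candidates.
    candidates = [s for s in segments if viewpoint in s.scores]
    ranked = sorted(candidates, key=lambda s: s.scores[viewpoint], reverse=True)

    chosen, total = [], 0.0
    for seg in ranked:
        duration = seg.end - seg.start
        if total + duration > max_total:
            break  # assumption: stop at the first segment that does not fit
        chosen.append(seg)
        total += duration

    # Assumption: restore temporal order for playback.
    return sorted(chosen, key=lambda s: s.start)
```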

また、要約エンジン１５０１は、指定された観点が複数ある場合は、各メディアセグメントに関して、指定された観点に関する重要度のうち最大のものをとり、その値でソートしてもよいし、それらの総和や相加平均をとり、その値でソートしてもよい。 In addition, when there are a plurality of designated viewpoints, the summarization engine 1501 may take the maximum importance of the designated viewpoints for each media segment, and may sort by the value, or may sum them up. Alternatively, an arithmetic average may be taken and sorted by that value.
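The three ways of combining importance values across several specified viewpoints (maximum, sum, arithmetic mean) can be written as one small helper. This is a sketch only: the function name `combined_score`, the `how` parameter, and the choices to ignore viewpoints that a segment does not carry and to return 0 when none is present are assumptions, since the text does not say how missing viewpoints are treated.

```python
from statistics import mean


def combined_score(scores, viewpoints, how="max"):
    """Combine a segment's importance values over the specified viewpoints.

    scores: dict mapping viewpoint keyword -> importance (positive integer)
    viewpoints: the viewpoints named in the selection condition
    how: "max", "sum" or "mean"
    """
    # Assumption: viewpoints absent from the segment are simply ignored.
    values = [scores[v] for v in viewpoints if v in scores]
    if not values:
        return 0  # assumption: sorts below the lowest valid importance, 1
    if how == "max":
        return max(values)
    if how == "sum":
        return sum(values)
    if how == "mean":
        return mean(values)
    raise ValueError(f"unknown combination rule: {how}")
```

The returned value can then serve as the sort key when ranking segments for the playback-time condition.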

また、メディアセグメントのある観点に関する重要度の条件と再生時間の条件との組み合わせであってもよい。 Further, it may be a combination of the importance condition and the playback time condition regarding a certain viewpoint of the media segment.

以上のように、実施の形態１０によれば、メディアセグメントに付加された、キーワードで表現される観点に基づいた重要度から、ユーザの興味のあるメディアセグメントだけの選択を行うことにより、ユーザの嗜好に合わせたあらすじやハイライトシーン集などを構成し、それらの表現記述データの生成が行える。これにより、ユーザが希望する部分だけのメディアコンテンツの再生、配信が行える。 As described above, according to the tenth embodiment, by selecting only the media segment of interest to the user from the importance based on the viewpoint expressed by the keyword added to the media segment, You can create synopses and highlight scene collections according to your preferences, and generate expression description data for them. As a result, it is possible to reproduce and distribute the media content only for the portion desired by the user.

（実施の形態１１） (Embodiment 11)
以下、本発明の実施の形態１１について説明する。実施の形態１０が複数のタイプのメディアを持たないのに対し、実施の形態１１は複数のタイプのメディアを有し、メディアセグメントの代替データが指定されていない構成に関するものである。実施の形態１１におけるデータ処理装置のブロック図は図２２に示したものと同様である。 Embodiment 11 of the present invention will now be described. Whereas the tenth embodiment does not have a plurality of types of media, the eleventh embodiment relates to a configuration that has a plurality of types of media and in which no alternative data for the media segments is specified. The block diagram of the data processing apparatus in the eleventh embodiment is the same as that shown in FIG. 22.

実施の形態１１においても、内容記述データ１５０３のためのDTDとして、図３６に示したものを用いる。また、図３９に実施の形態１１における構造記述データである内容記述１５０３の例を示す。 Also in the eleventh embodiment, as the DTD for the content description data 1503, the one shown in FIG. 36 is used. FIG. 39 shows an example of a content description 1503 that is structure description data in the eleventh embodiment.

図３９からわかるように、図３９に示す構造記述データは、タイプの異なるmediaObject要素を有し、segment要素毎に、pointOfView要素と、その属性であるviewPointおよび属性であるscoreが記述されている。 As can be seen from FIG. 39, the structure description data shown in FIG. 39 has mediaObject elements of different types, and for each segment element, a pointOfView element, its viewPoint and its attribute score are described.

実施の形態においても、選択条件１５０４は、実施の形態１０と同様であり、ある観点に関して、メディアセグメントのその観点に基づく重要度がしきい値以上であることとする。また、選択条件１５０４となる観点は複数であっても構わない。この場合の要約エンジン１５０１の処理は、実施の形態１０における要約エンジン１５０１の処理を、各mediaObject要素毎に行うこととなる。図４０に、実施の形態１１における要約エンジン１５０１の処理のフローチャートを示す。 In this embodiment as well, the selection condition 1504 is the same as in the tenth embodiment: for a given viewpoint, the importance of a media segment based on that viewpoint must be equal to or greater than a threshold value. The selection condition 1504 may specify more than one viewpoint. In this case, the summary engine 1501 performs the processing of the summary engine 1501 of the tenth embodiment for each mediaObject element. FIG. 40 shows a flowchart of the processing of the summary engine 1501 in the eleventh embodiment.

まず、要約エンジン１５０１は、ステップＳ４００１において、最初のmediaObject要素を取り出す。次に、要約エンジン１５０１は、ステップＳ４００２において、取り出したmediaObject要素の内容のうち、先頭のメディアセグメントであるsegment要素を取り出す。そして、要約エンジン１５０１は，ステップＳ４００３において、取り出したメディアセグメントであるsegment要素の内容であるpointOfView要素をすべて調べ、調べたpointOfView要素のviewPoint属性に、選択条件１５０４で指定された観点が指定されているものがあるか調べる。 First, in step S4001, the summary engine 1501 takes out the first mediaObject element. Next, in step S4002, the summary engine 1501 takes out the segment element that is the first media segment in the content of that mediaObject element. Then, in step S4003, the summary engine 1501 examines all the pointOfView elements contained in that segment element and checks whether the viewPoint attribute of any of them specifies the viewpoint given in the selection condition 1504.

そして、要約エンジン１５０１は、調べたpointOfView要素のviewPoint属性に、選択条件１５０４で指定された観点が指定されているものがある場合は、選択条件１５０４で指定された観点に基づく重要度としきい値とを比較するため、ステップＳ４００４の処理に移行する。一方、要約エンジン１５０１は、調べたpointOfView要素のviewPoint属性に、選択条件１５０４で指定された観点が指定されているものがない場合は、選択条件１５０４で指定された観点に基づく重要度がつけられていないため、ステップＳ４００６の処理に移行する。 If the viewPoint attribute of any examined pointOfView element specifies the viewpoint given in the selection condition 1504, the summary engine 1501 proceeds to step S4004 to compare the importance based on that viewpoint with the threshold value. If none does, no importance based on the viewpoint given in the selection condition 1504 has been assigned to the segment, so the summary engine 1501 proceeds to step S4006.

要約エンジン１５０１は、ステップＳ４００４において、選択条件１５０４で指定された観点に基づく重要度がしきい値以上であるかを調べる。また、要約エンジン１５０１は、選択条件１５０４で指定された観点に基づく重要度がしきい値以上の場合は、ステップＳ４００５の処理に移行し、選択条件１５０４で指定された観点に基づく重要度がしきい値未満の場合はステップＳ４００６の処理に移行する。 In step S4004, the summary engine 1501 checks whether the importance based on the viewpoint designated by the selection condition 1504 is equal to or greater than a threshold value. If the importance based on the viewpoint specified in the selection condition 1504 is equal to or greater than the threshold value, the summary engine 1501 proceeds to the process of step S4005 and determines the importance based on the viewpoint specified in the selection condition 1504. If it is less than the threshold value, the process proceeds to step S4006.

要約エンジン１５０１は、ステップＳ４００５では、該当メディアセグメントの開始時間と終了時間を表す、segment要素のstart属性とend属性の値を記述コンバータ１５０２へ出力する。また、要約エンジン１５０１は、ステップＳ４００６では、未処理のメディアセグメントがあるかどうかを調べ、未処理のメディアセグメントがある場合はステップＳ４００７の処理に移行する。また要約エンジン１５００は、未処理のメディアセグメントがない場合は、ステップＳ４００８の処理に移行する。 In step S4005, the summary engine 1501 outputs to the description converter 1502 the values of the start and end attributes of the segment element, which represent the start time and end time of the media segment. In step S4006, the summary engine 1501 checks whether any unprocessed media segment remains. If one remains, the process proceeds to step S4007; if none remains, the summary engine 1501 proceeds to step S4008.

要約エンジン１５０１は、ステップＳ４００８では未処理のmediaObject要素がまだ残っているかどうかを調べ、残っている場合はステップＳ４００９の処理に移行する。要約エンジン１５０１は、未処理のmediaObject要素が残っていない場合は処理を終了する。 In step S4008, the summary engine 1501 checks whether or not an unprocessed mediaObject element still remains. If it remains, the summary engine 1501 proceeds to the process of step S4009. The summary engine 1501 ends the process when there is no unprocessed mediaObject element remaining.

また、要約エンジン１５０１は、ステップＳ４００９では、未処理のmediaObject要素のうち、先頭のmediaObject要素を取り出し、ステップＳ４００２の処理に移行する。 In step S4009, the summary engine 1501 extracts the first mediaObject element from the unprocessed mediaObject elements, and proceeds to the process of step S4002.
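The per-mediaObject loop of steps S4001 to S4009 is the single-media selection wrapped in an outer loop. The Python sketch below assumes one specified viewpoint and represents each mediaObject as a (type, segments) pair; this data layout and the name `select_per_media_object` are hypothetical and chosen only for illustration.

```python
def select_per_media_object(media_objects, viewpoint, threshold):
    """Apply the threshold selection separately to each mediaObject.

    media_objects: list of (media_type, segments) pairs, where each segment is
    a (start, end, scores) tuple and scores maps viewpoint keyword -> importance.
    Returns a list of (media_type, [(start, end), ...]) pairs.
    """
    output = []
    for media_type, segments in media_objects:   # S4001, S4008, S4009
        selected = []
        for start, end, scores in segments:      # S4002, S4006
            score = scores.get(viewpoint)        # S4003
            if score is not None and score >= threshold:  # S4004
                selected.append((start, end))    # S4005
        output.append((media_type, selected))
    return output
```

Keeping the results grouped by mediaObject lets the description converter handle each media type in turn, in line with the per-mediaObject processing described for the eleventh embodiment.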

実施の形態１１の記述コンバータ１５０２も、各mediaObject要素毎の処理を行うが、それ以外は、実施の形態１に示した図４の構造記述データからＳＭＩＬへの変換の手順の処理と同様の処理を行う。 The description converter 1502 of the eleventh embodiment also performs processing for each mediaObject element, but otherwise the same processing as the processing of the procedure for converting the structure description data to SMIL of FIG. 4 shown in the first embodiment I do.

なお、実施の形態１１においては要約エンジン１５０１が、選択したメディアセグメントの要素の内容を記述コンバータ１５０２へ出力し、記述コンバータ１５０２はそれを用いて処理を行う構成であるが、要約エンジン１５０１が選択されたメディアセグメントだけを残した中間的な構造記述データを作成し、記述コンバータ１５０２はこの中間的な構造記述データを入力して処理を行う形態であってもよい。 In the eleventh embodiment, the summary engine 1501 outputs the content of the selected media segment element to the description converter 1502, and the description converter 1502 performs processing using the content, but the summary engine 1501 selects the content. It is also possible to create intermediate structure description data that leaves only the media segments that have been processed, and the description converter 1502 inputs the intermediate structure description data and performs processing.

（実施の形態１２） (Embodiment 12)
本発明の実施の形態１２について述べる。実施の形態１０がメディアセグメントの代替データが指定されていないのに対し、実施の形態１２はメディアセグメントの代替データが指定されているものである。また、実施の形態１２は、選択手段においてメディアセグメントを再生するか代替データを再生するかの選択を行わない構成に関するものである。実施の形態１２におけるデータ処理装置のブロック図は図２２に示したものと同様である。 Embodiment 12 of the present invention will now be described. Whereas no alternative data for the media segments is specified in the tenth embodiment, alternative data for the media segments is specified in the twelfth embodiment. The twelfth embodiment relates to a configuration in which the selection means does not select between playing back a media segment and playing back its alternative data. The block diagram of the data processing apparatus in the twelfth embodiment is the same as that shown in FIG. 22.

実施の形態１２において用いる構造記述データのDTDの例を図４１に示す。図４１に示すDTDは、図１３で示したDTDのsegment要素に、キーワードで表現される観点に基づく重要度を表すスコアを表すため、pointOfViewという要素を子要素となるように加えたものである。pointOfView要素は、viewPoint属性によって観点を表し、score属性によってviewPoint属性で示した観点に基づく重要度を表す。この重要度は、正の整数値で表されるものとし、１が最も重要度が低いとする。また、ひとつのsegment要素に複数のpointOfView要素をつけることができる。図４２に内容記述データ１５０３の例を示す。 An example of the DTD of the structure description data used in the twelfth embodiment is shown in FIG. The DTD shown in FIG. 41 is obtained by adding an element called pointOfView to the segment element of the DTD shown in FIG. 13 so as to represent a score representing importance based on the viewpoint expressed by the keyword so as to be a child element. . The pointOfView element represents a viewpoint by a viewPoint attribute, and represents an importance level based on the viewpoint indicated by the viewPoint attribute by a score attribute. This importance is represented by a positive integer value, and 1 is the lowest importance. Also, multiple pointOfView elements can be attached to one segment element. FIG. 42 shows an example of the content description data 1503.

図からもわかるように、図４２に示す内容記述データには、DTDのsegment要素に、pointOfViewという要素が子要素となるように加えられている。また、pointOfView要素は、viewPoint属性とscore属性とが記述されている。 As can be seen from the figure, in the content description data shown in FIG. 42, an element called pointOfView is added as a child element to the segment element of the DTD. Also, the pointOfView element describes a viewPoint attribute and a score attribute.

実施の形態１２における要約エンジン１５０１の処理は、実施の形態１０における要約エンジン１５０１の処理と同様である。だたし、実施の形態１２における要約エンジン１５０１の処理は、選択したメディアセグメントの出力する際に、segment要素のstart属性とend属性に加えて、子要素であるalt要素も出力する。 The processing of summary engine 1501 in the twelfth embodiment is the same as the processing of summary engine 1501 in the tenth embodiment. However, in the process of the summary engine 1501 in the twelfth embodiment, when the selected media segment is output, in addition to the start attribute and end attribute of the segment element, an alt element that is a child element is also output.

また、実施の形態１２における記述コンバータ１５０２の処理は、実施の形態１、実施の形態２、あるいは実施の形態３に示した図４の構造記述データからＳＭＩＬへの変換の手順の処理と同様である。 The processing of the description converter 1502 in the twelfth embodiment is the same as the processing of the procedure for converting the structure description data into SMIL shown in FIG. 4 shown in the first, second, or third embodiment. is there.

なお、実施の形態においてはなお、実施の形態１０においては要約エンジン１５０１が、選択したメディアセグメントの要素の内容を記述コンバータ１５０２へ出力し、記述コンバータ１５０２はそれを用いて処理を行う構成であるが、要約エンジン１５０１が選択されたメディアセグメントだけを残した中間的な構造記述データを作成し、記述コンバータ１５０２はこの中間的な構造記述データを入力して処理を行う形態であってもよい。 In this embodiment, as in the tenth embodiment, the summary engine 1501 outputs the content of the elements of the selected media segments to the description converter 1502, and the description converter 1502 performs its processing using that content. Alternatively, the summary engine 1501 may create intermediate structure description data in which only the selected media segments are left, and the description converter 1502 may take this intermediate structure description data as input and process it.

（実施の形態１３） (Embodiment 13)
本発明の実施の形態１３について述べる。実施の形態１１がメディアセグメントの代替データが指定されていないのに対し、実施の形態１３はメディアセグメントの代替データが指定されているものに関する。また、実施の形態１３は、選択手段においてメディアセグメントを再生するか代替データを再生するかの選択を行わない構成に関するものである。実施の形態１３におけるデータ処理装置のブロック図は図１５に示したものと同様である。 Embodiment 13 of the present invention will now be described. Whereas no alternative data for the media segments is specified in the eleventh embodiment, the thirteenth embodiment relates to a configuration in which alternative data for the media segments is specified. The thirteenth embodiment also relates to a configuration in which the selection means does not select between playing back a media segment and playing back its alternative data. The block diagram of the data processing apparatus in the thirteenth embodiment is the same as that shown in FIG. 15.

実施の形態１３においても、内容記述１５０３のためのＤＴＤとして、図４１に示したものを用いる。図４３、図４４に実施の形態１２における構造記述データ１５０３の例を示す。 In the thirteenth embodiment as well, the DTD shown in FIG. 41 is used as the DTD for the content description 1503. FIGS. 43 and 44 show examples of the structure description data 1503 in the twelfth embodiment.

図からもわかるように、実施の形態１３の構造記述データには、異なるタイプのmediaObject要素を有し、mediaObject要素毎にsegment要素を有する。そして、segment要素毎に、pointOfView要素と、その属性であるviewPointおよび属性であるscoreが記述されている。 As can be seen from the figure, the structure description data of the thirteenth embodiment has different types of mediaObject elements, and each mediaObject element has a segment element. For each segment element, a pointOfView element, its attribute viewPoint, and an attribute score are described.

実施の形態１２における要約エンジン１５０１の処理は、実施の形態１１における要約エンジンの処理と同様である。だたし、実施の形態１２における要約エンジン１５０１は、選択したメディアセグメントを出力する際に、segment要素のstart属性とend属性に加えて、子要素であるalt要素も出力する。 The processing of summary engine 1501 in the twelfth embodiment is the same as the processing of summary engine in the eleventh embodiment. However, when outputting the selected media segment, the summary engine 1501 according to the twelfth embodiment outputs an alt element as a child element in addition to the start attribute and end attribute of the segment element.

(Embodiment 14)
Embodiment 14 of the present invention will be described. Whereas Embodiment 12 relates to a configuration in which the selection means does not choose between playing back a media segment and playing back its alternative data, Embodiment 14 relates to a configuration in which the selection means makes that choice. In Embodiment 14, the selection means is divided into media segment selection means and playback media selection means, and the selection condition is divided into a segment selection condition and a playback media selection condition. The block diagram of the data processing apparatus in Embodiment 14 is therefore the same as that shown in FIG. 35.

In Embodiment 14, the content description data 2804 is the same as the content description 1503 in Embodiment 12. That is, it uses the DTD shown in FIG. 41, and an example is shown in FIG. 42.

The segment selection condition 2805 is the same as the selection condition 1504 in Embodiment 10 or Embodiment 12. In this case, the processing of the summary engine 2801 is the same as that of the summary engine 1501 in Embodiment 12.

The processing by which the conversion unit 2803 converts the content description 2804 into SMIL is the same as the procedure for converting structure description data into SMIL shown in FIG. 4 and described in Embodiment 1 or Embodiment 2.

(Embodiment 15)
Embodiment 15 of the present invention will be described. Whereas Embodiment 13 relates to a configuration in which the selection means does not choose between playing back a media segment and playing back its alternative data, Embodiment 15 relates to a configuration in which the selection means makes that choice. In Embodiment 15, as in Embodiment 8, the selection means is divided into media segment selection means and playback media selection means, and the selection condition is divided into a segment selection condition and a playback media selection condition. The block diagram of the data processing apparatus in this embodiment is therefore the same as that shown in FIG. 35.

The structure description data 2804 of Embodiment 15 is the same as the structure description data 1503 of Embodiment 13. That is, it uses the DTD shown in FIG. 41, and an example is shown in FIGS. 43 and 44.

The segment selection condition 2805 of Embodiment 15 is the same as the segment selection condition 2805 of Embodiment 14. The processing of the summary engine 2801 is therefore the same as that of the summary engine 1501 in Embodiment 13.

The playback media selection unit 2802 of Embodiment 15 performs the same processing as the playback media selection unit 2802 described in Embodiment 14.

The conversion unit 2803 of Embodiment 15 receives the elements of the media segments selected by the summary engine 2801 and the selection result of the playback media selection unit 2802 and, based on that result, outputs the reproduction method description 2807, which is expression description data in SMIL.

The conversion to SMIL performed by the conversion unit 2803 of Embodiment 15 follows the same procedure as the conversion from structure description data to SMIL shown in FIG. 4 and described in Embodiment 1 or Embodiment 2.
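As a rough illustration of what such a conversion unit produces, the sketch below turns a list of selected media segments into a SMIL document, emitting either the segment itself or its alternative data according to a playback media choice. It is not the procedure of FIG. 4. The dictionary layout, the file names, the single `use_alt` flag, and the use of SMIL 1.0 style `clip-begin` and `clip-end` attributes with `npt=` values are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Assumed mapping from a media type name to a SMIL media element name.
SMIL_TAG = {"video": "video", "audio": "audio", "image": "img", "text": "text"}


def to_smil(segments, use_alt):
    """Build a SMIL string that plays the given segments one after another."""
    smil = ET.Element("smil")
    body = ET.SubElement(smil, "body")
    seq = ET.SubElement(body, "seq")  # sequential playback in list order
    for seg in segments:
        alt = seg.get("alt")
        if use_alt and alt:
            # Show the alternative data for as long as the segment would run.
            el = ET.SubElement(seq, SMIL_TAG.get(alt["type"], "ref"))
            el.set("src", alt["src"])
            el.set("dur", f'{seg["end"] - seg["start"]}s')
        else:
            # Play only the selected part of the original media file.
            el = ET.SubElement(seq, SMIL_TAG.get(seg["type"], "ref"))
            el.set("src", seg["src"])
            el.set("clip-begin", f'npt={seg["start"]}s')
            el.set("clip-end", f'npt={seg["end"]}s')
    return ET.tostring(smil, encoding="unicode")


# Hypothetical input: times in seconds, file names invented for illustration.
segments = [
    {"type": "video", "src": "match.mpg", "start": 0, "end": 10,
     "alt": {"type": "image", "src": "scene1.jpg"}},
    {"type": "video", "src": "match.mpg", "start": 30, "end": 40,
     "alt": {"type": "image", "src": "scene3.jpg"}},
]
smil_media = to_smil(segments, use_alt=False)
smil_alt = to_smil(segments, use_alt=True)
```

A real converter would make the segment-or-alternative decision per segment and would also group synchronized segments of different media types; the single flag keeps the sketch short.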

(Embodiment 16)
Embodiment 16 of the present invention will be described. FIG. 45 shows a block diagram of the data processing apparatus in Embodiment 16. In FIG. 45, 3801 denotes a structure description data database, 3802 a selection unit, 3803 a conversion unit, 3804 a playback unit, and 3805 a media content database. Further, 3806 denotes structure description data, 3808 summary content description data, 3809 expression description data, and 3810 media content data.

The selection unit 3802, the conversion unit 3803, the structure description data 3806, the selection condition 3807, and the expression description data 3809 are the same as those described in any of Embodiments 4 to 15. The summary structure description data 3808 corresponds to the intermediate structure description data, described in any of Embodiments 4 to 15, that retains only the selected media segments. The selection unit 3802 and the conversion unit 3803 can be implemented by executing a program on a computer.

Because the expression description data 3809 is expressed in SMIL, a SMIL player can be used as the playback unit 3804. A SMIL player can be implemented by executing a program on a computer, and free SMIL player software is in circulation, for example Real Player from Real Networks.

In Embodiment 16 the selection unit 3802 outputs the summary structure description data 3808. However, as described in any of Embodiments 4 to 15, it may instead output the selected media segments without outputting the summary structure description data 3808.

(Embodiment 17)
A server-client system according to Embodiment 17 of the present invention will be described with reference to FIG. 46. In Embodiment 17, the selection unit 3802 and the conversion unit 3803 are provided on the server 4601 side, and the playback unit 3804 is provided on the client 4602 side. The conversion unit 3803 and the playback unit 3804 are connected over the network 4603. Embodiment 17 is thus a server-client system in which the expression description data 3809 is communicated over the network.

The processing performed by each processing unit is written as a computer-executable program, stored in storage media on the server 4601 side and on the client 4602 side, and executed on each side.

Note that the metadata database 1001 may be used in place of the structure description database 3801, the summary engine 1002, 1501, or 2801 in place of the selection unit 3802, the description converter 1003, 1502, or 2800 in place of the conversion unit 3803, the player 1004 in place of the playback unit 3804, and the media content database 1005 in place of the media content database 3805.

As shown in FIG. 47, Embodiment 17 may also take a form in which the media content database 3805 is provided in the server 4601a and the media content data 3810 is sent to the client 4602a over the network 4603.

(Embodiment 18)
A server-client system according to Embodiment 18 of the present invention will be described.

Embodiment 18 will be described with reference to FIG. 48. In Embodiment 18, the selection unit 3802 is provided on the server 4701 side, and the conversion unit 3803 and the playback unit 3804 are provided on the client 4702 side. The selection unit 3802 and the conversion unit 3803 are connected over the network 4603. Embodiment 18 is thus a server-client system in which the summary structure description data 3808 is communicated over the network.

The processing performed by each processing unit is written as a computer-executable program, stored in storage media on the server 4701 side and on the client 4702 side, and executed on each side.

As shown in FIG. 49, Embodiment 18 may also take a form in which the media content database 3805 is provided in the server 4701a and the media content data 3810 is sent to the client 4702a over the network 4603.

As described above, according to the present invention, structure description data that describes the composition of media content in terms of media segments can be converted into expression description data that expresses how the media content is to be played back. Consequently, constraints such as playback timing and synchronization information can be applied to each media segment during playback, enabling a variety of playback forms.

Further, according to the present invention, describing alternative data for a media segment in the structure description data makes it possible to select whether to play back the media segment itself or its alternative data. Content can thus be delivered and played back in media suited to the capacity and traffic of the network that delivers the media content, the capability of the terminal that plays it back, and so on.
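One way to picture this choice is the small decision function below. It is only a sketch: the text does not state the rule used by the playback media selection unit, so the bit-rate comparison, the parameter names, and the numbers are all assumptions.

```python
def choose_playback_media(segment_kbps, alt_kbps, available_kbps,
                          can_decode_segment):
    """Return "segment", "alt", or None for one media segment.

    Assumed rule: prefer the original segment when the terminal can decode
    it and the network can carry it; otherwise fall back to the alternative
    data if that fits; otherwise play nothing.
    """
    if can_decode_segment and segment_kbps <= available_kbps:
        return "segment"
    if alt_kbps <= available_kbps:
        return "alt"
    return None


# Hypothetical figures: a 1500 kbps video segment with a 64 kbps alternative.
on_broadband = choose_playback_media(1500, 64, 2000, True)
on_narrowband = choose_playback_media(1500, 64, 500, True)
on_weak_terminal = choose_playback_media(1500, 64, 2000, False)
```

In this sketch the same structure description data serves all three terminals; only the selection condition differs.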

Further, according to the present invention, additionally describing in the structure description data a score for each media segment based on its contextual content makes it easy to play back or deliver, for example, highlight collections of various playback lengths. Moreover, when the score is based on a viewpoint indicated by a keyword, specifying the keyword makes it possible to play back and deliver only the scenes that match the user's preference.
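The idea of building highlight collections of different lengths from keyword-based scores can be sketched as below. The greedy "highest score first, within a time budget" rule, the data layout, and the sample values are assumptions for illustration; the text does not prescribe this particular rule.

```python
def pick_highlights(segments, keyword, max_seconds):
    """Pick the highest-scoring segments for a keyword within a time budget."""
    # Only segments that carry a score for the requested viewpoint keyword.
    candidates = [s for s in segments if keyword in s["scores"]]
    candidates.sort(key=lambda s: s["scores"][keyword], reverse=True)

    chosen, total = [], 0
    for seg in candidates:
        length = seg["end"] - seg["start"]
        if total + length <= max_seconds:
            chosen.append(seg)
            total += length
    # Play the chosen scenes in their original time order.
    return sorted(chosen, key=lambda s: s["start"])


# Hypothetical segments: times in seconds, scores per viewpoint keyword.
segments = [
    {"start": 0, "end": 10, "scores": {"goal": 5}},
    {"start": 10, "end": 30, "scores": {"goal": 3}},
    {"start": 30, "end": 40, "scores": {"goal": 4}},
    {"start": 40, "end": 50, "scores": {"foul": 5}},
]
short_digest = pick_highlights(segments, "goal", 20)
long_digest = pick_highlights(segments, "goal", 60)
```

Changing only the time budget or the keyword yields a different digest from the same structure description data, which is the effect the paragraph above describes.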

FIG. 1 is a conceptual diagram of a data processing system according to Embodiment 1 of the present invention.
FIG. 2(a) is a diagram showing the DTD of the structure description data in Embodiment 1, and FIG. 2(b) is a diagram showing an example of the structure description data in Embodiment 1.
FIG. 3 is a diagram showing another example of the structure description data in Embodiment 1.
FIG. 4 is a flowchart of the conversion from structure description data to expression description data in Embodiment 1.
FIG. 5 is a flowchart in which the description converter according to Embodiment 1 outputs a reproduction method description, which is a SMIL document, from a summary content description, which is structure description data.
FIG. 6 is a diagram showing the structure of a SMIL document.
FIG. 7 is a diagram showing an example of the expression description data in Embodiment 1.
FIG. 8 is a diagram showing an example of the expression description data in Embodiment 1.
FIG. 9 is a flowchart in which the description converter according to Embodiment 1 outputs a reproduction method description, which is a SMIL document, from a summary content description, which is structure description data.
FIG. 10 is a diagram showing an example of the expression description data in Embodiment 1.
FIG. 11 is a diagram showing an example of the expression description data in Embodiment 1.
FIG. 12 is a diagram showing an example of the expression description data in Embodiment 1.
FIG. 13 is a diagram showing the DTD of the structure description data in Embodiment 2 of the present invention.
FIG. 14 is a diagram showing an example of the structure description data in Embodiment 2.
FIG. 15 is a diagram showing another example of the structure description data in Embodiment 2.
FIG. 16 is a flowchart in which the description converter according to Embodiment 2 outputs a reproduction method description, which is a SMIL document, from a summary content description, which is structure description data.
FIG. 17 is a diagram showing an example of the expression description data in Embodiment 2.
FIG. 18 is a flowchart in which the description converter according to Embodiment 2 outputs a reproduction method description, which is a SMIL document, from a summary content description, which is structure description data.
FIG. 19 is a flowchart of the conversion from structure description data to expression description data in Embodiment 3.
FIG. 20 is a diagram showing an example of the expression description data in Embodiment 3.
FIG. 21(a) is a diagram showing the DTD of the extension of the structure description data in Embodiment 3, and FIG. 21(b) is a diagram showing an example of the extension of the structure description data in Embodiment 3.
FIG. 22 is a block diagram of the data processing apparatus in Embodiment 4 of the present invention.
FIG. 23 is a diagram showing the DTD of the structure description data in Embodiment 4.
FIG. 24 is a diagram showing an example of the structure description data in Embodiment 4.
FIG. 25 is a flowchart of the processing of the selection unit in Embodiment 4.
FIG. 26 is a diagram showing an example of the intermediate structure description data in Embodiment 4.
FIG. 27 is a diagram showing an example of the structure description data in Embodiment 5 of the present invention.
FIG. 28 is a flowchart of the processing of the selection unit in Embodiment 5.
FIG. 29 is a diagram showing an example of the intermediate structure description data in Embodiment 5.
FIG. 30 is a diagram showing the DTD of the structure description data in Embodiment 6 of the present invention.
FIG. 31 is a diagram showing an example of the structure description data in Embodiment 6.
FIG. 32 is a diagram showing an example of the intermediate structure description data in Embodiment 6.
FIG. 33 is a diagram showing an example of the structure description data in Embodiment 7 of the present invention.
FIG. 34 is a diagram showing an example of the intermediate structure description data in Embodiment 7.
FIG. 35 is a block diagram of the data processing apparatus in Embodiment 8 of the present invention.
FIG. 36 is a diagram showing the DTD of the structure description data in Embodiment 10 of the present invention.
FIG. 37 is a diagram showing an example of the structure description data in Embodiment 10.
FIG. 38 is a flowchart of the processing of the selection unit in Embodiment 10.
FIG. 39 is a diagram showing an example of the structure description data in Embodiment 11 of the present invention.
FIG. 40 is a flowchart of the processing of the selection unit in Embodiment 11.
FIG. 41 is a diagram showing the DTD of the structure description data in Embodiment 12 of the present invention.
FIG. 42 is a diagram showing an example of the structure description data in Embodiment 12.
FIG. 43 is a first diagram showing an example of the structure description data in Embodiment 13 of the present invention.
FIG. 44 is a second diagram showing an example of the structure description data in Embodiment 13.
FIG. 45 is a block diagram of the data processing apparatus in Embodiment 16 of the present invention.
FIG. 46 is a block diagram of the server-client system of Embodiment 17 of the present invention.
FIG. 47 is a block diagram of another example of the server-client system of Embodiment 17.
FIG. 48 is a block diagram of the server-client system of Embodiment 18 of the present invention.
FIG. 49 is a block diagram of another example of the server-client system of Embodiment 18.
FIG. 50 is a block diagram of a conventional video distribution apparatus.

Explanation of symbols

1001 Metadata database
1002, 1501, 2801 Summary engine
1003, 1502, 2800 Description converter
1004 Player
1005 Media content database
2801 Media segment selection means
2802 Playback media selection unit
2803 Conversion unit
3801 Structure description data database
3802 Selection unit
3803 Conversion unit
3804 Playback unit
3805 Media content database
4601, 4701 Server
4602, 4702 Client
4603 Network

Claims

1. A data processing apparatus comprising:
an input unit that receives structure description data in which the content of media content, which is continuous audiovisual information, is divided by time for each media type and described as a plurality of media segments, each including time information consisting of a playback start time paired with an end time or a duration and a score based on contextual content, the structure description data including synchronization information indicating that a plurality of media are played back in synchronization, and that receives a selection condition for selecting predetermined media segments from the structure description data;
a selection unit that selects from the structure description data only the media segments having a score that meets the selection condition; and
a conversion unit that, from the media type, the time information, and the synchronization information described for each media segment of summary structure description data, which is intermediate structure description data retaining only the media segments selected by the selection unit, groups the media segments of different types that are to be played back in synchronization and sets, for each resulting group of media segments, the synchronization information indicating synchronized playback,
sets, from the time information, a playback order for the groups of media segments that arranges the groups from the earliest time,
sets, for each group of media segments, from the media type and the time information, the playback start time and end time of each media segment according to its media type as the playback timing, and
converts the summary structure description data into expression description data in which the groups of media segments for which the synchronization information is set are arranged in the set playback order and the playback timing set for each media segment is described, and outputs the expression description data.

2. The data processing apparatus according to claim 1, wherein the conversion unit merges the media segments having contiguous time information into one for each media type, resets the playback start time and end time, and arranges the media segments in playback order from the earliest time.

3. The data processing apparatus according to claim 1, wherein the structure description data has a set of alternative data that replaces the media segment, and
the conversion unit converts the summary structure description data into expression description data in which at least one of the media segment and the alternative data is set for each media segment for which the set synchronization information is described, the results are arranged in the set playback order, and the set playback timing is described for each media segment.

4. The data processing apparatus according to claim 3, wherein the selection unit selects whether to reproduce the media segment or the alternative data when selecting the media segment.

5. The data processing apparatus according to claim 1, wherein the expression description data is a SMIL document.

6. The data processing apparatus according to claim 1, wherein the score is an importance level of the corresponding media segment based on the context content of the media content.

7. The data processing apparatus according to any one of claims 1 to 6, wherein a viewpoint expressed by a keyword is assigned to the media segment, and the score is the importance of the corresponding media segment based on that viewpoint.

8. A data processing method in a data processing apparatus comprising an input unit that inputs structure description data describing the content of media content that is continuous audiovisual information, a selection unit that selects predetermined media segments from the structure description data, and a conversion unit that converts summary structure description data, which is intermediate structure description data retaining only the selected media segments, into expression description data and outputs the expression description data, the method comprising:
a step in which the input unit receives structure description data in which the content of the media content, which is continuous audiovisual information, is divided by time for each media type and described as a plurality of media segments, each including time information consisting of a playback start time paired with an end time or a duration and a score based on contextual content, the structure description data including synchronization information indicating that a plurality of media are played back in synchronization, and receives a selection condition for selecting predetermined media segments from the structure description data;
a step in which the selection unit selects from the structure description data only the media segments having a score that meets the selection condition; and
a step in which the conversion unit,
from the media type, the time information, and the synchronization information described for each media segment of the summary structure description data, which is intermediate structure description data retaining only the media segments selected in the selecting step, groups the media segments of different types that are to be played back in synchronization and sets, for each resulting group of media segments, the synchronization information indicating synchronized playback,
sets, from the time information, a playback order for the groups of media segments that arranges the groups from the earliest time,
sets, for each group of media segments, from the media type and the time information, the playback start time and end time of each media segment according to its media type as the playback timing, and
converts the summary structure description data into expression description data in which the groups of media segments for which the synchronization information is set are arranged in the set playback order and the playback timing set for each media segment is described, and outputs the expression description data.

9. The data processing method according to claim 8, wherein the conversion unit merges the media segments having contiguous time information into one for each media type, resets the playback start time and end time, and arranges the media segments in playback order from the earliest time.

10. A program used in a data processing apparatus comprising an input unit that inputs structure description data describing the content of media content that is continuous audiovisual information, a selection unit that selects predetermined media segments from the structure description data, and a conversion unit that converts summary structure description data, which is intermediate structure description data retaining only the selected media segments, into expression description data and outputs the expression description data, the program causing a computer to function as the data processing apparatus by executing:
a step of causing the input unit to receive structure description data in which the content of the media content, which is continuous audiovisual information, is divided by time for each media type and described as a plurality of media segments, each including time information consisting of a playback start time paired with an end time or a duration and a score based on contextual content, the structure description data including synchronization information indicating that a plurality of media are played back in synchronization, and to receive a selection condition for selecting predetermined media segments from the structure description data;
a step of causing the selection unit to select from the structure description data only the media segments having a score that meets the selection condition; and
a step of causing the conversion unit,
from the media type, the time information, and the synchronization information described for each media segment of the summary structure description data, which is intermediate structure description data retaining only the media segments selected by the selection unit, to group the media segments of different types that are to be played back in synchronization and to set, for each resulting group of media segments, the synchronization information indicating synchronized playback,
to set, from the time information, a playback order for the groups of media segments that arranges the groups from the earliest time,
to set, for each group of media segments, from the media type and the time information, the playback timing of each media segment according to its media type, and
to convert the summary structure description data into expression description data in which the groups of media segments for which the synchronization information is set are arranged in the set playback order and the playback timing set for each media segment is described, and to output the expression description data.