JP2006129133A

JP2006129133A - Content reproducing apparatus

Info

Publication number: JP2006129133A
Application number: JP2004315450A
Authority: JP
Inventors: Tetsuhiko Kaneaki; 哲彦金秋
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 2004-10-29
Filing date: 2004-10-29
Publication date: 2006-05-18

Abstract

PROBLEM TO BE SOLVED: To efficiently reproduce content of the same program that was recorded separately, using metadata created by a third party.

SOLUTION: A synchronization tag is added to the metadata in advance. The tag indicates the feature difference between the frame it points to and the pictures before and after that frame. A synchronization detecting means 7 detects the synchronization tag in the metadata read from a metadata accumulating means 2 and, on the basis of the feature difference, determines which frame of the content read from a content accumulating means 6 corresponds to the tag. The difference between the detected frame and the time described in the metadata is obtained, and a metadata correcting means 8 corrects the metadata. A reproduction control means 3 then controls a content reproduction means 5 on the basis of the corrected metadata to decide which part should be reproduced, and the content is reproduced.

COPYRIGHT: (C)2006, JPO & NCIPI

Description

The present invention relates to a content reproduction apparatus that uses metadata to efficiently reproduce content obtained by recording a general broadcast program.

Conventionally, the apparatus described in Patent Document 1 is known as a content reproduction apparatus that uses metadata. That video playback device comprises video material recording means (a content storage device) that records video material; a metadata storage device that receives and stores associated information data (metadata) defining the information content of the video material; and playback control means that controls playback of the video material recorded in the video material recording means on the basis of the metadata. The metadata includes a UMID, which is a unique identification index, and start/end time codes. Based on this information, video with little motion is fast-forwarded in order to shorten the time needed to view the entire content.
JP 2002-281457 A (pages 1 and 5)

However, in the above configuration the metadata must be specific to the particular video content. Consider reproducing a program (content) recorded independently in each home by using metadata that a third party (another device) created for the same content. The following problems arise. 1) The time at which recording started differs from device to device, so the same picture lies at a different elapsed time from the start of recording. 2) The clocks of the individual recording devices disagree, so even if recording starts simultaneously, the recording start time attached to the recorded content differs. In addition, because of the bit rate used for compression and, further, the condition of the radio waves reaching each home, the received pictures themselves are not necessarily identical owing to noise and ghosting.

Consequently, if playback is attempted on the basis of the times written in the metadata, playback does not start correctly from the intended scene.

In view of the above problems, the present invention provides metadata and a content reproduction apparatus with which content can be reproduced as desired even when metadata created by a third party for the same content is used.

To solve this problem, the content reproduction apparatus of the present invention comprises: metadata storage means that stores metadata in which information on video content has been extracted and which includes synchronization information based on scene changes in the video content used for the extraction; content storage means that stores video content; synchronization means that synchronizes the metadata with the video content stored in the content storage means; and correction means that corrects the metadata on the basis of the output of the synchronization means.

Furthermore, in the content reproduction apparatus of the present invention, the synchronization information includes information on a feature difference that indicates how the picture changed relative to the picture immediately before the location to which the synchronization information is attached.

As described above, according to the present invention, video content that a user recorded independently can be reproduced using metadata created by a third party and, conversely, video content that a third party recorded can be reproduced using metadata created from the user's own recording.

The invention according to claim 1 comprises: metadata storage means that stores metadata in which information on video content has been extracted and which includes synchronization information based on scene changes in the video content used for the extraction; content storage means that stores video content; synchronization means that synchronizes the metadata with the video content stored in the content storage means; and correction means that corrects the metadata on the basis of the output of the synchronization means. Once it has been determined, at any one location, which frame corresponds to the picture indicated by the metadata, the number of frames from one picture to another is the same regardless of the picture quality or bit rate of the recorded content. Metadata created by a third party can therefore be turned into metadata synchronized with the user's own video content.

The invention according to claim 2 is the content reproduction apparatus according to claim 1, wherein the synchronization information includes information on a feature difference indicating how the picture changed relative to the picture immediately before the location to which the synchronization information is attached. By checking whether the comparison result given in advance in the metadata matches the actual comparison result, the location indicated by the synchronization information can be narrowed down more efficiently.

The invention according to claim 3 is the content reproduction apparatus according to claim 1, wherein the video content used for the extraction and the video content stored in the content storage means are streams compressed by the MPEG system. As a result, the features of each frame of the video content stored in the content storage means appear more distinctly in a smaller amount of data.

The invention according to claim 4 is the content reproduction apparatus according to claim 3, wherein the synchronization information includes information on a feature difference indicating how the location to which the synchronization information is attached changed relative to at least one of the immediately preceding and immediately following I pictures. By checking whether the comparison result given in advance in the metadata matches the actual comparison result, the location indicated by the synchronization information can be narrowed down more efficiently.

The invention according to claim 5 is the content reproduction apparatus according to claim 3, wherein the synchronization information is attached to a location where the feature difference from the immediately preceding I picture is particularly large. When detecting synchronization, the features of adjacent I pictures are compared and the location with the largest difference is taken as the first candidate, so synchronization can be achieved more efficiently.

The invention according to claim 6 is the content reproduction apparatus according to claim 4 or 5, wherein the feature difference includes at least one of the size, luminance, hue, and color saturation of the I picture. If a wrong I picture is selected, the probability that the comparison result given in advance in the metadata matches the actual comparison result becomes low, so the I picture indicated by the synchronization information can be narrowed down more efficiently.

The invention according to claim 7 is the content reproduction apparatus according to claim 3, wherein the synchronization information is attached to a location where no other scene change occurs for at least several seconds before and after the scene change that it indicates. Only the location indicated by the synchronization information then differs greatly from the immediately preceding I picture, which makes it easier to find the I picture that corresponds to the synchronization information.

The invention according to claim 8 is the content reproduction apparatus according to claim 3, wherein a plurality of pieces of synchronization information exist in the metadata. This further reduces false detection.

The invention according to claim 9 is the content reproduction apparatus according to claim 8, wherein the synchronization means calculates the number of frames between the locations to which synchronization information is attached and performs synchronization detection by using the fact that, in the video content, the positions separated by the calculated number of frames hold either an I picture or the largest B or P picture among those lying between the I pictures on either side. The detection can thus be carried out merely by examining the data amounts of the I, B, and P pictures, and synchronization can be achieved without decoding the MPEG-compressed content.

The invention according to claim 10 is the content reproduction apparatus according to claim 1, further comprising: storage means that stores reproduction conditions for reproducing the video content stored in the content storage means; control means that controls reproduction of the video content on the basis of the output of the correction means and the reproduction conditions; and reproduction means that reproduces the video content on the basis of the output of the content storage means and the output of the control means. The content can thus be reproduced freely under the user's preferred conditions using the corrected metadata.

The invention according to claim 11 is the content reproduction apparatus according to claim 1, further comprising means for saving the metadata corrected by the correction means or for replacing the uncorrected metadata with it. Once the metadata has been synchronized with the stored video content, there is no need to synchronize again the next time the video content is reproduced using that metadata.

The invention according to claim 12 is the content reproduction apparatus according to claim 11, further comprising means for identifying whether the metadata is already synchronized with the video content stored in the first storage means. When video content is to be reproduced using metadata, it can therefore be determined at once whether synchronization must be performed beforehand.

The invention according to claim 13 is the content reproduction apparatus according to claim 1, wherein the video content stored in the content storage means carries information indicating the method by which it was compressed. This makes it possible to synchronize metadata even with video content created by a third party.

Embodiments of the present invention are described below with reference to the drawings.

(Embodiment)
FIG. 1 is a block diagram showing the configuration of a content reproduction apparatus according to an embodiment of the present invention. Video content is recorded in the content storage means 6; here it holds video content obtained when a recording device recorded an ordinary television broadcast with MPEG2 compression. The video content is stored as files, and the recording date and time and an outline of the content can be obtained by reading the file attributes.

The metadata storage means 2 stores metadata corresponding to the video content stored in the content storage means 6. Here it holds metadata that a third party (a separate device) created from its own recording of the same television broadcast. The metadata describes, in a predetermined format, the recording date and time, the content, which scenes are recorded where in the corresponding video content, and synchronization information to be used when the metadata is applied to, for example, the same program recorded by another video recording device.

FIG. 2 shows a specific example of metadata applicable to the present invention. Here the metadata is written in XML (eXtensible Markup Language) and consists of a header part, which outlines the metadata, and a body part, which describes the specific content of the video content.

The header part begins with the header tag (<header>). It holds the program title of the video content (<title>), the program genre (<category>), the recording date and time (date), the recording length (duration), keywords describing the content (kywd), and encoding information (<enc>) indicating how the video content used to create the metadata was compressed. The encoding information consists of a method parameter, which gives the encoding method, and a mode parameter, which gives its mode. The method parameter takes one of three values: mpeg2, mpeg4, or none. The values mpeg2 and mpeg4 mean that the video content was compressed by the MPEG2 or MPEG4 system respectively, and none means that the video content was not MPEG-compressed. The mode parameter is valid when the method parameter is anything other than none. Two modes are provided: constant, in which I pictures are inserted at a fixed period regardless of scene changes, and optimum, in which an I picture is forcibly assigned at every scene change and I pictures are then inserted at a fixed period until the next scene change. In this embodiment the mode is constant. The offset tag (<offset>) is a tag for metadata correction and is described later.

Here the program genre is "sports/baseball", which shows that the program is a sports program and that the sport is baseball. The "/" is used to subdivide the genre further. The encoding information reads "method=mpeg2;mode=constant", which shows that the content was encoded in MPEG2 in the constant mode, in which an I picture is not forcibly assigned at a scene change. Everything is written in Roman letters here, but katakana and kanji may of course be used as well.
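The encoding information string quoted above can be split into its two parameters with a few lines of code. The following is only a minimal sketch: the function name `parse_enc` and the returned dictionary layout are assumptions made for illustration, and the only values accepted are the ones the text lists (mpeg2, mpeg4, none; constant, optimum).

```python
def parse_enc(text):
    """Split a string such as 'method=mpeg2;mode=constant' into its parameters.

    parse_enc is a hypothetical helper, not part of the described apparatus.
    """
    params = {}
    for part in text.split(";"):
        key, _, value = part.strip().partition("=")
        params[key.strip()] = value.strip()

    method = params.get("method")
    if method not in ("mpeg2", "mpeg4", "none"):
        raise ValueError(f"unknown method: {method!r}")

    # The mode parameter is only meaningful for MPEG-compressed content.
    mode = params.get("mode") if method != "none" else None
    if mode is not None and mode not in ("constant", "optimum"):
        raise ValueError(f"unknown mode: {mode!r}")

    return {"method": method, "mode": mode}


enc = parse_enc("method=mpeg2;mode=constant")
print(enc)  # {'method': 'mpeg2', 'mode': 'constant'}
```

Ignoring the mode when the method is none mirrors the statement that the mode parameter is valid only for MPEG-compressed content.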

The body part begins with the body tag (<body>). It records which scenes are located where, treating each scene as a segment and describing it with a segment tag (<seg>). Each segment tag contains a position tag (<pos>), which indicates where the segment lies within the whole video content, a keyword tag (<kywd>), which describes its content, and so on. The position tag has a from parameter, which gives the time from the start of the video content at which the segment begins, a duration parameter, which gives the length of the segment, and a unit parameter, which specifies the unit of those values.

In this example, segment 1 has keywords such as "N.Y." and "batter#1", so it is a scene of New York's first batter. Its from parameter is 12000, its duration parameter is 5500, and its unit is ms, so the scene runs for 5500 ms (5.5 seconds) starting 12000 ms (12 seconds) after the start of the video content. Segment 4 carries "sync=on" inside its segment tag, which indicates that this segment should be used to synchronize with the video content. To make synchronization easier, and because the header shows that this metadata was created from MPEG2-compressed video content, the segment also carries information on the feature difference between its start frame and the I pictures immediately before and after it. The feature differences from both the preceding and the following I picture are given here, but since the picture usually changes little right after a scene change, only the feature difference from the immediately preceding I picture may be given.
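As an illustration of the body structure described above, the sketch below reads segment entries with Python's standard `xml.etree.ElementTree`. FIG. 2 is not reproduced in this text, so the exact XML syntax is an assumption: the segment number and the sync flag are written as `id` and `sync` attributes, and the position parameters as attributes of `<pos>`. The values for segment 1 come from the description; those for segment 4 are invented placeholders.

```python
import xml.etree.ElementTree as ET

# Hypothetical rendering of the metadata body; attribute names are assumptions.
SAMPLE = """
<meta>
  <body>
    <seg id="1">
      <pos from="12000" duration="5500" unit="ms"/>
      <kywd>N.Y., batter#1</kywd>
    </seg>
    <seg id="4" sync="on">
      <pos from="95000" duration="8000" unit="ms"/>
      <kywd>placeholder</kywd>
    </seg>
  </body>
</meta>
"""


def read_segments(xml_text):
    """Return one dict per <seg> element, in document order."""
    root = ET.fromstring(xml_text.strip())
    segments = []
    for seg in root.iter("seg"):
        pos = seg.find("pos")
        segments.append({
            "id": int(seg.get("id")),
            "from": int(pos.get("from")),
            "duration": int(pos.get("duration")),
            "unit": pos.get("unit"),
            "sync": seg.get("sync") == "on",
            "keywords": [k.strip() for k in seg.findtext("kywd", "").split(",")],
        })
    return segments


segments = read_segments(SAMPLE)
sync_ids = [s["id"] for s in segments if s["sync"]]
print(segments[0]["from"], segments[0]["duration"], sync_ids)  # 12000 5500 [4]
```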

The operation is described below with reference to FIG. 5. Through the control means 10, video content and its corresponding metadata are selected from the content storage means 6 and the metadata storage means 2 (step 31). The content storage means 6 and the metadata storage means 2 begin reading the video content and the metadata (step 32), and the selected metadata is displayed (step 33). The display device is not shown because it is not a novel element.

Based on the displayed metadata, the user sets the reproduction conditions through the control means 10 (step 34). The conditions can be set freely according to the user's preference; for example, a particular scene may be played repeatedly, or a particular scene may be skipped.

Meanwhile, the metadata read in step 32 is searched for synchronization tags (step 35). As shown for segments 4 and 9 in FIG. 2, a synchronization tag appears as "sync=on" inside a segment tag and means that the segment should be used to synchronize with the video content. The parameters that follow "sync=on" are not used here.

When a synchronization tag is found, the video content is analyzed to find where the corresponding segment actually lies (step 36). If the analysis finishes and synchronization is achieved, "synchronization complete" is selected in step 37 and processing moves to step 38. How the content analysis is carried out is described later.

When synchronization is judged OK in step 37, processing moves to step 38 and the metadata is corrected. That is, the difference obtained through the series of operations in steps 36, 37, and 41 between the start time of each scene given in the metadata and the actual start time of that scene, obtained from the attributes of the video content stored in the content storage means 6 and the like, is corrected. Here the correction is made by writing that difference between the start time given in the metadata and the actual start time into the offset tag in the header part of the metadata shown in FIG. 2.

Specifically, the offset tag gives, in its time and unit parameters, a value for the difference between the segment start time given in the metadata and the corresponding time in the actual video content to be reproduced, together with the unit of that value. Its base parameter states which video playback device the offset data was corrected for; here the serial number of the playback device is written into the base parameter. Conversely, if the value of the base parameter matches the serial number of the playback device, the metadata is treated as already corrected.
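The check on the base parameter amounts to a single comparison. The sketch below assumes the offset tag has already been read into a dictionary; the function name `needs_sync`, the dictionary layout, and the serial number strings are all hypothetical.

```python
def needs_sync(offset_tag, device_serial):
    """Return True when the metadata has not yet been corrected on this device.

    offset_tag is a hypothetical dict view of the <offset> tag's parameters,
    or None when the metadata carries no offset tag at all.
    """
    return offset_tag is None or offset_tag.get("base") != device_serial


# Serial numbers below are invented placeholders.
corrected = {"time": -1233, "unit": "ms", "base": "SN-0001"}
print(needs_sync(corrected, "SN-0001"))  # False
print(needs_sync(corrected, "SN-0002"))  # True
```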

The value of the offset tag is used as follows. Suppose that reproduction of segment 1 has been specified in the reproduction condition data storage means 4. The reproduction control means 3 reads the offset tag in the metadata header from the metadata correction means 8. Since time=-1233 and unit=ms, -1233 ms is added to the from parameter of each segment.

Next, the value of the position tag in the segment tag <seg=1> is read. Its from parameter is 12000 and its unit parameter is ms, so the content reproduction means 5 is instructed to reproduce the video starting 10767 ms after the start of the video content, that is, 12000 ms plus the -1233 ms given by the offset tag.
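The arithmetic in this step can be written out directly. This is a minimal sketch under the assumption that the from parameter and the offset share the same unit; `corrected_start` is a hypothetical name.

```python
def corrected_start(from_value, offset_value):
    """Add the header offset to a segment's from parameter (same unit assumed)."""
    return from_value + offset_value


start_ms = corrected_start(12000, -1233)
print(start_ms)  # 10767
```

Because the same offset applies to every segment, the correction is a single addition per segment and the segment durations are left unchanged.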

Even when the recordings made in individual homes start at different times, exactly the same pictures are transmitted to every home when viewed frame by frame, so once synchronization is achieved at one point, the two recordings are fully synchronized everywhere. Therefore, as described above, correcting the synchronization at any single location reliably synchronizes all locations.

When the correction step (step 38) is finished, normal reproduction starts (step 39). Based on the information stored in the reproduction condition data storage means 4 about which scenes to reproduce, the reproduction control means 3 selects the segment tags of those scenes from the metadata storage means 2 and, using the tag information corrected by the metadata correction means 8, instructs the content reproduction means 5 which part of the video content to reproduce.

In the same way, the video content is reproduced in the order specified in the reproduction condition data storage means 4.

If step 36 cannot determine where the segment with the synchronization tag lies, "No" is selected in step 37 and processing moves to step 41, where a message is displayed to the effect that synchronization could not be achieved. The video content is then reproduced normally from the beginning (step 42).

In the above embodiment, ms (milliseconds) is used as the value of the unit parameter, but the unit is of course not limited to this; seconds, minutes, frames, or fields may also be used. Video is often managed frame by frame, so using the frame as the unit is particularly effective because it avoids the effect of variations among the source oscillators that generate the clock signal in individual content reproduction apparatuses.
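To show how a time-based value relates to a frame count, the sketch below converts a value and its unit into frames. The description only names ms explicitly; the unit strings "s", "min", and "frame" and the NTSC frame rate of 30000/1001 frames per second are assumptions chosen for illustration.

```python
from fractions import Fraction

# Assumed frame rate (NTSC); the description does not state one.
NTSC_FPS = Fraction(30000, 1001)


def to_frames(value, unit, fps=NTSC_FPS):
    """Convert a metadata value to a whole number of frames.

    Unit names other than "ms" are hypothetical spellings.
    """
    if unit == "frame":
        return value
    seconds_per_unit = {
        "ms": Fraction(1, 1000),
        "s": Fraction(1),
        "min": Fraction(60),
    }
    seconds = value * seconds_per_unit[unit]
    return round(seconds * fps)


print(to_frames(12000, "ms"))  # 360
```

Exact fractions are used so that the conversion itself introduces no rounding error before the final rounding to a frame number.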

With this configuration, the user only has to select the video content and its corresponding metadata on the content reproduction apparatus 1 and enter which scenes to watch in order to reproduce the video content using metadata created by a third party.

Next, the content analysis method is described. To make the explanation easier to follow, the picture compression generally performed in a video recorder that uses MPEG compression is first explained with reference to FIG. 3. FIG. 3(1) shows the video before compression, that is, the television program transmitted from the broadcasting station. Each division represents a frame. FIG. 3(2) is the result of MPEG-compressing that signal. I denotes an I picture, P a P picture, and B a B picture.

When video is compressed and recorded, the compression sequence is normally simplified and playback made convenient by periodically assigning I pictures, which are independent of other frames, P pictures, which represent the difference from an I picture, and B pictures, which represent the difference from the I or P pictures before and after them. In many cases a pattern such as "IBBPBBPBBPBBPBB" is repeated periodically, as shown in the part of FIG. 3(2) before time A. However, if a scene change occurs at time A in FIG. 3(1), for example, taking the difference from the preceding video data would produce a huge amount of data. In such a case that picture is encoded as an I picture, and the IBBPBB... cycle is repeated starting from it. This corresponds to the mode parameter optimum in the encoding information of the metadata header. Some models, however, give priority to the amount of computation required for encoding over the increase in data and keep the IBBPBB... cycle regardless of scene changes, as shown in FIG. 4(2). This corresponds to the mode parameter constant in the encoding information of the metadata header.
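The difference between the two modes can be seen by generating the picture-type sequence for a short run of frames. The sketch below is a simplification: it lists picture types in display order, uses the 15-picture pattern quoted in the text, and takes scene-change positions as given frame indices. The function name `picture_types` and the example scene-change position are assumptions.

```python
GOP = "IBBPBBPBBPBBPBB"  # repeating pattern quoted in the description


def picture_types(n_frames, scene_changes=(), mode="constant"):
    """Return the picture type of each frame as a string of I/P/B letters.

    In "optimum" mode the pattern restarts with an I picture at every scene
    change; in "constant" mode scene changes are ignored.
    """
    cuts = set(scene_changes)
    phase = 0
    types = []
    for frame in range(n_frames):
        if mode == "optimum" and frame in cuts:
            phase = 0  # force an I picture and restart the cycle
        types.append(GOP[phase])
        phase = (phase + 1) % len(GOP)
    return "".join(types)


constant = picture_types(20, scene_changes=[7], mode="constant")
optimum = picture_types(20, scene_changes=[7], mode="optimum")
print(constant)  # IBBPBBPBBPBBPBBIBBPB
print(optimum)   # IBBPBBPIBBPBBPBBPBBP
```

With a scene change at frame 7, the optimum sequence places an I picture exactly at the cut, while the constant sequence still has a B picture there.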

FIG. 3(3) shows the MPEG compression result when the video content shown in FIG. 3(1) is recorded by a different video recorder. Because the recording start times differ, which of I, P, and B each transmitted frame is assigned to during the period following the start of recording depends on the video recorder. In this figure, the encoder keeps the IBBPBB... cycle regardless of scene changes, so the I, P, and B assignments in FIG. 3(2) and FIG. 3(3) remain out of alignment even after the scene change at time A.

FIG. 4(3) shows a case where the encoding method of the video recorder differs. In this figure, the encoder assigns an I picture at every scene change and repeats the IBBPBB... cycle starting from that picture.

Whether the video content stored in the content storage means 6 is of the type shown in FIG. 3(3) or FIG. 4(3) is naturally known if the content was recorded by the apparatus itself, but if it was recorded by a third party, the type may not be known. For this reason, information indicating which type applies is added to the header of each item of video content stored in the content storage means 6, following the metadata convention, as constant mode (FIG. 3(3)) or optimum mode (FIG. 4(3)).

Meanwhile, synchronization tags in the metadata are placed mainly at points where a scene change occurs. It is particularly effective to choose scene changes for which the picture created at that point has an especially large amount of data.

Returning to the content analysis method, FIG. 6 is a sequence diagram showing a specific example of the content analysis in step 36. When the analysis starts, the synchronization detection means 7 extracts the segments that carry synchronization tags and calculates the intervals between them (step 51). Each segment has a position tag indicating its start time, so the interval is easily obtained by reading the from parameter given in that tag and taking the difference. Here, the intervals are measured in frames, and the interval between the segment carrying the N-th synchronization tag and the segment carrying the (N+1)-th synchronization tag is denoted FN frames.
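The interval calculation of step 51 can be sketched as follows. The segment representation is an assumption made for illustration: each segment is a dictionary whose `from` value is already expressed in frames and whose `sync` flag marks the presence of a synchronization tag. The actual metadata layout is the one shown in FIG. 2, which is not reproduced here.

```python
def sync_intervals(segments):
    """Return the frame intervals FN between consecutive tagged segments.

    segments: list of dicts with the hypothetical keys
              "from" (start position in frames) and "sync" (bool).
    """
    starts = [seg["from"] for seg in segments if seg.get("sync")]
    # FN is the distance from the N-th tagged segment to the (N+1)-th one.
    return [later - earlier for earlier, later in zip(starts, starts[1:])]


segments = [
    {"from": 0, "sync": False},
    {"from": 450, "sync": True},
    {"from": 900},
    {"from": 5850, "sync": True},
    {"from": 11250, "sync": True},
]
print(sync_intervals(segments))
```

If the from parameter is stored as a time code rather than a frame count, it would first have to be converted using the frame rate of the content.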

Once the intervals between the segments carrying synchronization tags have been obtained, the next step is to select candidates for the first segment carrying a synchronization tag.

If the video content to be synchronized was MPEG-compressed using the method shown in FIG. 4(3), that is, optimum mode, an I picture is assigned at every scene change, and synchronization tags are placed at scene changes, so in step 52 an I picture can be chosen as the candidate frame. In practice, an approximate I picture position is calculated from the recording start time given in the attributes of the video content and the time of the synchronization tag given in the segment tag, and an I picture near that position is selected.

Conversely, if the video content to be synchronized was MPEG-compressed using the method shown in FIG. 3(3), that is, constant mode, I, P, and B pictures are assigned at a fixed cycle regardless of scene changes. In step 52, therefore, an approximate position is calculated from the recording start time given in the attributes of the video content and the time of the synchronization tag given in the segment tag, and a nearby P or B picture whose size is a local maximum, or a nearby I picture, is chosen as the candidate frame. "Local maximum" here means that the picture is the largest in size within the span bounded by the I pictures on either side of it. The same meaning applies below unless otherwise stated.
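The "local maximum" criterion can be made concrete with a short sketch. The function name and the `(type, size)` tuple representation are hypothetical; the code simply finds, for each span bounded by two I pictures, the largest P or B picture inside it.

```python
def largest_between_i_pictures(pictures):
    """Return the indices of the largest P/B picture in each I-to-I span.

    pictures: list of (picture_type, size_in_bytes) tuples in stream order.
    Pictures before the first I picture or after the last one are ignored,
    because they are not bounded by I pictures on both sides.
    """
    result = []
    span = []       # indices of the P/B pictures seen since the last I picture
    seen_i = False
    for index, (ptype, _size) in enumerate(pictures):
        if ptype == "I":
            if seen_i and span:
                result.append(max(span, key=lambda i: pictures[i][1]))
            seen_i = True
            span = []
        elif seen_i:
            span.append(index)
    return result


stream = [("I", 90), ("B", 10), ("B", 12), ("P", 30), ("B", 55), ("B", 11),
          ("I", 88), ("B", 9), ("P", 25), ("I", 91)]
print(largest_between_i_pictures(stream))
```

In constant mode, the candidate set would then consist of these indices together with the I pictures themselves.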

In step 53, if the video content to be synchronized uses optimum mode, an I picture is assigned at every scene change. Therefore, if the I picture selected as the candidate is the scene indicated by the synchronization tag, the pictures selected in step 53 at the intervals calculated in step 51 should all be I pictures. In that case, synchronization is judged to have been achieved, and the process proceeds to step 55, where the difference between the time of the candidate scene in the actual video content and the value indicated by the metadata is extracted, and the analysis ends.

If the video content to be synchronized was MPEG-compressed in constant mode, then, as stated above, synchronization tags are always placed at scene changes, and at a scene change either the picture size is a local maximum or an I picture happens to have been assigned. Therefore, if the picture selected as the candidate is the scene in question, every picture selected in step 53 should be either a P or B picture whose size is a local maximum or an I picture. In that case, synchronization is judged to have been achieved, and the process proceeds to step 55, where the difference between the time of the candidate scene in the actual video content and the value indicated by the metadata is extracted, and the analysis ends.
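The check described for step 53 can be sketched as below. This is an interpretation rather than the stated procedure: it assumes that step 53 looks at the pictures lying at the intervals FN from the candidate, as claim 9 suggests, and the names `intervals_match` and `qualifies` are hypothetical. The `qualifies` callback stands for "is an I picture" in optimum mode, or "is an I picture or a locally maximal P or B picture" in constant mode.

```python
from itertools import accumulate


def intervals_match(candidate, intervals, qualifies):
    """Check whether every tagged position implied by the candidate qualifies.

    candidate: frame index assumed to be the first tagged scene change.
    intervals: the frame intervals FN between consecutive tagged segments.
    qualifies: callable taking a frame index and returning True or False.
    """
    positions = [candidate + offset for offset in accumulate(intervals)]
    return all(qualifies(position) for position in positions)


# Toy data: frames that satisfy the condition in the recorded content.
qualifying_frames = {100, 400, 1000}
intervals = [300, 600]
print(intervals_match(100, intervals, qualifying_frames.__contains__))
print(intervals_match(115, intervals, qualifying_frames.__contains__))
```

When the check succeeds, the offset between the candidate position and the position given in the metadata is what step 55 extracts.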

Conversely, if one or more of the pictures selected in step 53 does not satisfy the condition, synchronization is judged not to have been achieved, and the next candidate must be extracted. In this case, the process moves to step 57, which determines whether any further candidate remains, that is, an I picture or a P or B picture whose size is a local maximum. In practice, an I picture, or a P or B picture whose size is a local maximum, located before or after the previously selected picture is chosen.

For example, if the video content to be synchronized was MPEG-compressed in optimum mode and the 20th I picture was chosen as the first candidate, the following candidates are best chosen as the 21st, 19th, 22nd, 18th, and so on, that is, I pictures progressively farther from the first candidate.

If the video content to be synchronized was MPEG-compressed in constant mode, the next candidate is the locally maximal P or B picture, or the I picture, immediately before the first candidate, then the one immediately after the first candidate, then the ones further before and further after, so that the candidates move progressively away from the first candidate.

Whether a next candidate exists is decided as follows: when the distance from the I picture selected as the first candidate reaches or exceeds a certain value, the result is "no next candidate". The user can choose what that value is.
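The alternating search order and the distance limit can be sketched with a small generator. The name `candidate_order` and the parameter `max_distance` are hypothetical; `max_distance` plays the role of the user-selectable limit, and indices refer to positions in the list of eligible pictures (I pictures in optimum mode).

```python
def candidate_order(first, max_distance):
    """Yield candidate indices moving progressively away from the first one.

    The order is first, first+1, first-1, first+2, first-2, ...
    Iteration stops once the distance would exceed max_distance, which
    corresponds to the "no next candidate" outcome of step 57.
    """
    yield first
    for distance in range(1, max_distance + 1):
        yield first + distance
        yield first - distance


print(list(candidate_order(20, 2)))
```

A caller would skip any index that falls outside the recorded content, for example a value below the first picture.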

If the result in step 57 is "no next candidate", the process proceeds to step 59, sets an "undetectable" flag indicating that synchronization could not be achieved, and ends the sequence.

In the above example, the metadata was described as having been created from MPEG-compressed video content. If the metadata is instead created from video content recorded not with MPEG-2 or MPEG-4 compression but in, for example, the MiniDV format used in digital video camcorders, the metadata is created so that it contains information indicating the feature difference between the start frame of each segment and the frames immediately before and after it. The feature difference from the immediately following frame is only one example; the feature difference from the frame 15 frames later, 15 frames being the period at which I pictures are inserted in FIGS. 3 and 4, may be used instead. In this way, if the video content in the content reproduction apparatus 1 is MPEG-compressed and uses optimum mode, the degree of agreement with the feature difference given in the metadata can be made higher.

FIG. 7 shows another embodiment of the content reproduction apparatus according to the present invention. In this figure, steps having the same functions as those in FIGS. 5 and 6 are given the same reference numerals, and detailed description of them is omitted. In this embodiment the same metadata is used, but the content analysis step differs from that of FIG. 5.

In step 35, the metadata read in step 32 is searched for a synchronization tag. When a synchronization tag is found, the video content is analyzed to determine where the corresponding segment actually lies (step 46). If the analysis finishes and synchronization has been achieved, "synchronization complete" is selected in step 37 and the process moves to step 38. Thereafter, as in the case shown in FIG. 5, the metadata is corrected and playback of the video content begins. Details of the content analysis are described later.

If step 46 could not determine where the segment carrying the synchronization tag lies, "No" is selected in step 37 and the next synchronization tag is searched for (step 43). If a synchronization tag is found, the process returns to step 46 and the content analysis is performed again. If no synchronization tag is found, the process moves to step 41, displays a message that synchronization could not be achieved, and plays the video content normally from the beginning (step 42).
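The control flow of FIG. 7 can be summarized in a short sketch. All names are hypothetical: `analyze` stands for the content analysis of step 46 and is assumed to return a frame offset, or `None` when the tagged segment cannot be located. The `to` key and the sign convention of the offset are likewise assumptions made for illustration.

```python
def correct_metadata(segments, offset):
    """Shift every segment by the detected offset (metadata correction)."""
    return [
        {**seg, "from": seg["from"] + offset, "to": seg["to"] + offset}
        for seg in segments
    ]


def prepare_playback(segments, analyze):
    """Try each synchronization tag in turn, as in steps 35, 46, 37 and 43.

    Returns the corrected segments when one tag can be synchronized, or
    None when every tag fails, in which case the caller would report the
    failure and play the content normally from the beginning.
    """
    for seg in segments:
        if not seg.get("sync"):
            continue
        offset = analyze(seg)
        if offset is not None:
            return correct_metadata(segments, offset)
    return None


segments = [
    {"from": 0, "to": 100, "sync": True},
    {"from": 100, "to": 250, "sync": True},
]
# Toy analysis: only the second tag can be located, 12 frames late.
print(prepare_playback(segments, lambda seg: 12 if seg["from"] == 100 else None))
```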

Next, the content analysis method is described with reference to FIG. 8. As stated above, synchronization tags are placed at scene changes. If the video content to be synchronized was MPEG-compressed in optimum mode, an I picture is assigned at every scene change, so an I picture is chosen as the candidate frame in step 61. Conversely, if the video content was MPEG-compressed in constant mode, I, P, and B pictures are assigned at a fixed cycle regardless of scene changes, so in step 61 an approximate position is calculated from the recording start time given in the attributes of the video content and the time of the synchronization tag given in the segment tag, and a nearby P or B picture whose size is a local maximum, or a nearby I picture, is chosen as the candidate frame.

In practice, an approximate picture position is calculated from the recording start time given in the attributes of the video content and the time of the synchronization tag given in the segment tag, and a nearby picture satisfying the above condition is selected as the candidate.

In steps 62 and 63, if the video content to be synchronized was MPEG-compressed in optimum mode, the following processing is performed. First, the 9th, 10th, and 11th I pictures before the candidate I picture are extracted. If the candidate I picture is the 50th from the start of the video content, IP41, IP40, and IP39 are extracted. Because an I picture can be decoded into a complete image on its own, these pictures are decoded in step 63 and their feature differences are extracted. Here, the I picture size, the overall luminance, the change in hue, and the difference in volume of the audio signal attached to each I picture are compared. How these should change is written after the synchronization tag. If segment 4 is being examined, its tag reads frame=ip brite=updc size=downup col=updown tint=updown audio=upnc. These fields give, respectively, the type of the picture to which the tag is attached (I, P, or B), the change in luminance, the change in picture size, the change in color density, the change in hue, and the change in volume, each expressed as a comparison with the immediately preceding and immediately following I pictures. For example, updown means that the value has increased relative to the immediately preceding I picture and that the immediately following I picture has decreased relative to the current one. dc stands for "dontcare" (not checked), and nc stands for "nochange" (no change).
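The tag fields quoted for segment 4 can be split into their two comparisons with a small parser. This is a sketch under an assumption: the fields are taken to be separated by whitespace, since the separators did not survive in the extracted text. The function name `parse_sync_tag` is hypothetical.

```python
import re

# Each value is a concatenation of two tokens: the change relative to the
# preceding I picture, then the change of the following I picture.
TOKEN = re.compile(r"up|down|dc|nc")


def parse_sync_tag(text):
    """Parse 'key=value' fields into a dict of (before, after) token pairs.

    The 'frame' field is kept as a plain string because it names the
    picture type rather than a pair of changes.
    """
    fields = {}
    for item in text.split():
        key, value = item.split("=", 1)
        if key == "frame":
            fields[key] = value
        else:
            fields[key] = tuple(TOKEN.findall(value))
    return fields


tag = "frame=ip brite=updc size=downup col=updown tint=updown audio=upnc"
print(parse_sync_tag(tag))
```

Splitting `updc` yields `("up", "dc")`, meaning "increased relative to the preceding I picture, following I picture not checked".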

In the video content used to create the metadata, the luminance is recorded as increase then don't care, the size as decrease then increase, the hue as increase then decrease, and the volume as increase then no change, so the I pictures IP41, IP40, and IP39 are checked to see whether they actually change in this way. If every item agrees with the recorded values, synchronization is considered to have been achieved, the process proceeds to step 55, the time difference is extracted, and the content analysis ends. If even one item does not agree, the process proceeds to step 67.
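The all-items-must-agree comparison can be sketched as follows. The feature dictionaries, the function names, and the `tolerance` parameter are assumptions for illustration; the text does not say how small a difference counts as "no change".

```python
def change(before, after, tolerance=0.0):
    """Classify the change from one value to the next as up, down or nc."""
    if after - before > tolerance:
        return "up"
    if before - after > tolerance:
        return "down"
    return "nc"


def features_match(expected, prev, cur, nxt, tolerance=0.0):
    """Return True only if every expected item agrees with the observation.

    expected: dict mapping a feature name to a (vs_previous, vs_next) pair
              of tokens from {"up", "down", "nc", "dc"}.
    prev, cur, nxt: feature values measured on the preceding, current and
              following pictures.
    """
    for key, wanted in expected.items():
        observed = (
            change(prev[key], cur[key], tolerance),  # current vs preceding
            change(cur[key], nxt[key], tolerance),   # following vs current
        )
        for want, got in zip(wanted, observed):
            if want != "dc" and want != got:
                return False  # a single mismatch rejects the candidate
    return True


expected = {"brite": ("up", "dc"), "size": ("down", "up"),
            "tint": ("up", "down"), "audio": ("up", "nc")}
prev = {"brite": 100, "size": 50, "tint": 10, "audio": 5}
cur = {"brite": 120, "size": 40, "tint": 20, "audio": 8}
nxt = {"brite": 90, "size": 60, "tint": 15, "audio": 8}
print(features_match(expected, prev, cur, nxt))
```

In practice a nonzero tolerance would probably be needed, because two recordings of the same broadcast rarely produce identical measurements.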

On the other hand, if the video content to be synchronized was MPEG-compressed in constant mode, steps 62 and 63 perform the following processing. Because the video content to be analyzed has I, P, and B pictures assigned at a fixed cycle regardless of scene changes, the locally maximal P or B picture lying between the 9th and 10th I pictures before the candidate picture is extracted together with the I pictures on either side of it. If the candidate picture is a B picture lying between the 50th and 51st I pictures from the start of the video content, IP42, IP41, and the B picture are extracted. If the candidate picture is an I picture, the I pictures before and after it are likewise extracted.

To reconstruct the image from a B picture, decoding must use the preceding I and P pictures, but since decoding need only start from the immediately preceding I picture, the processing time is considered negligible to human perception. An I picture can be decoded into a complete image on its own. These pictures are therefore decoded in step 63 and their feature differences are extracted. Thereafter, as in optimum mode, the differences in size, luminance, hue, and volume are compared, and if every item agrees with the values given in the metadata, synchronization is considered to have been achieved, the process proceeds to step 55, the time difference is extracted, and the content analysis ends. If even one item does not agree, the process proceeds to step 67.

Step 67 confirms that synchronization detection is still within the permitted range. Here, once the comparison of step 63 has been performed 20 times or more in optimum mode, or 40 times or more in constant mode, synchronization is judged undetectable and the process proceeds to step 59. If the count has not reached the specified number, the next candidate picture is extracted in step 68. In optimum mode, the next I picture is extracted. In constant mode, if the previously extracted picture was an I picture, the next B or P picture whose size is a local maximum is extracted; otherwise, the I picture immediately following the previously extracted picture is extracted.
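The bounded retry loop of steps 62 to 68 can be sketched as below. The names are hypothetical: `candidates` is any iterable producing candidate pictures in the order step 68 would extract them, and `compare` stands for the feature comparison of step 63. The limits of 20 and 40 comparisons are the ones given in the text.

```python
from itertools import islice

MAX_COMPARISONS = {"optimum": 20, "constant": 40}


def detect_sync(candidates, compare, mode):
    """Return the first matching candidate, or None if the limit is reached.

    Returning None corresponds to setting the "undetectable" flag in
    step 59 after the permitted number of comparisons has been used up.
    """
    for picture in islice(candidates, MAX_COMPARISONS[mode]):
        if compare(picture):
            return picture
    return None


# Toy data: candidates are just numbers and the match is the value 20,
# which is the 21st candidate and therefore out of reach in optimum mode.
print(detect_sync(range(100), lambda p: p == 20, "optimum"))
print(detect_sync(range(100), lambda p: p == 20, "constant"))
```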

Synchronization detection is performed through the content analysis described above.

The MPEG compression algorithm used for the video content from which the metadata was created is not necessarily identical to that used for the video content to be played back, and, as noted earlier, noise or ghosting caused by reception conditions can make even the source pictures differ. Nevertheless, when the pictures are characterized in terms of changes in luminance, I picture size, and the like, the difference between the two is considered to be very small, and using these parameters makes it easy to synchronize the metadata with the video content.

In this embodiment, the analysis in step 62 starts from the 10th I picture before the candidate picture, but the range to be analyzed may of course be made user-configurable. Also, although matching is checked sequentially starting from 10 pictures before, the check may instead start at the candidate picture and move progressively away from it (+1, -1, +2, -2, ...).

Also, in this embodiment the feature comparison is performed for only one picture carrying a synchronization tag. Alternatively, after the time difference has been corrected in step 65, the picture carrying the next synchronization tag may be located, the processing of steps 62 and 63 applied to it, and a match confirmed in step 64. If no match is obtained, the next candidate picture may be extracted, as long as the determination in step 67 is Yes, and the loop consisting of steps 62, 63, 64, 67, and 68 repeated to search for an I picture for which all items match.

Also, in this embodiment the feature differences are taken with respect to the I pictures before and after the candidate picture, but the difference from only the immediately preceding I picture may instead be obtained and compared with the values given in the metadata.

When values such as the luminance and hue of a picture are obtained, fixing the measurement rule in advance, for example taking the average brightness of the entire screen for luminance and the average hue of only the central part of the screen for hue, makes the picture features more distinct and reduces misjudgments.

The appropriate number and spacing of synchronization tags depend on the length of the video content, but placing roughly one tag every 3 to 5 minutes, with at least 4 or 5 tags near the beginning of the content, is considered suitable because it makes synchronization easier.

The content reproduction apparatus according to the present invention can play back independently recorded video content using metadata created on a different device and, conversely, metadata created from independently recorded video content can be used to play back video content recorded on another device. It is therefore useful as a content reproduction apparatus, video reproduction apparatus, or the like that uses metadata to efficiently play back content obtained by recording broadcast programs.

FIG. 1 is a block diagram showing the configuration of a content reproduction apparatus according to an embodiment of the present invention.
FIG. 2 is a diagram showing a specific example of the metadata in the content reproduction apparatus.
FIG. 3 is a conceptual diagram showing the relationship between video and the MPEG-compressed stream in the content reproduction apparatus.
FIG. 4 is a conceptual diagram showing the relationship between video and the MPEG-compressed stream in the content reproduction apparatus.
FIG. 5 is a flowchart showing the flow of synchronization detection and content playback in the content reproduction apparatus.
FIG. 6 is a flowchart showing the flow of the synchronization detection method in the content reproduction apparatus.
FIG. 7 is a flowchart showing the flow of the second synchronization detection and content playback in the content reproduction apparatus.
FIG. 8 is a flowchart showing the flow of the second synchronization detection method in the content reproduction apparatus.

Explanation of symbols

1 Content reproduction apparatus
2 Metadata storage means
3 Reproduction control means
4 Reproduction condition data storage means
5 Content reproduction means
6 Content storage means
7 Synchronization detection means

Claims

1. A content reproduction apparatus comprising:
metadata storage means for storing metadata that contains information extracted from video content and includes synchronization information based on scene changes in the video content used for the extraction;
content storage means for storing video content;
synchronization means for synchronizing the metadata with the video content stored in the content storage means; and
correction means for correcting the metadata based on the output of the synchronization means.

2. The content reproduction apparatus according to claim 1, wherein the synchronization information includes information on a feature difference indicating how the screen has changed relative to the screen immediately before the location to which the synchronization information is added.

3. The content reproduction apparatus according to claim 1, wherein the video content used for the extraction and the video content stored in the content storage means are streams compressed by an MPEG method.

4. The content reproduction apparatus according to claim 3, wherein the synchronization information includes information on a feature difference indicating how the location to which the synchronization information is added has changed relative to at least one of the immediately preceding and immediately following I pictures.

5. The content reproduction apparatus according to claim 3, wherein the synchronization information is attached to a location having a particularly large feature difference from the immediately preceding I picture.

6. The content reproduction apparatus according to claim 4, wherein the feature difference includes at least one of the size, luminance, hue, and color density of an I picture.

7. The content reproduction apparatus according to claim 3, wherein the synchronization information is attached to a location where no other scene change occurs for at least several seconds before and after the scene change indicated by the synchronization information.

8. The content reproduction apparatus according to claim 3, wherein a plurality of items of the synchronization information exist in the metadata.

9. The content reproduction apparatus according to claim 8, wherein the synchronization means calculates the number of frames between the locations to which synchronization information is added, and performs synchronization detection by utilizing the fact that, at the locations in the video content corresponding to the calculated values, there exists either an I picture or a B picture or P picture whose size is the largest between the I pictures on either side of it.

10. The content reproduction apparatus according to claim 1, further comprising:
accumulation means for accumulating reproduction conditions for reproducing the video content stored in the content storage means;
control means for controlling reproduction of the video content based on the output of the correction means and the reproduction conditions; and
reproduction means for reproducing the video content based on the output of the content storage means and the output of the control means.

11. The content reproduction apparatus according to claim 1, further comprising means for storing the metadata corrected by the correction means, or for replacing the metadata before correction with it.

12. The content reproduction apparatus according to claim 11, further comprising means for identifying whether the metadata has already been synchronized with the video content stored in the content storage means.

13. The content reproduction apparatus according to claim 1, further comprising information indicating the method by which the video content stored in the content storage means was compressed.