JP2009124298A

JP2009124298A - Device and method for reproducing coded video image

Info

Publication number: JP2009124298A
Application number: JP2007294133A
Authority: JP
Inventors: Masaaki Shimada; 昌明島田; Hidetoshi Mishima; 英俊三嶋
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 2007-11-13
Filing date: 2007-11-13
Publication date: 2009-06-04

Abstract

PROBLEM TO BE SOLVED: To provide a device and method for reproducing coded video images capable of searching for and reproducing desired video scenes quickly, easily, and reliably.

SOLUTION: The device 100 for reproducing coded video images includes a characteristic scene extracting portion 135 that detects segments and shots, which are video sections at a plurality of hierarchical levels, from physical variations of video signals and extracts characteristic information representing the respective video sections of each hierarchical level; an OSD creating portion 137 that creates a video search menu including thumbnail pictures representing the segments and shots of each hierarchical level; and a system controlling portion 120 that starts reproduction from the video scene related to the thumbnail picture selected from the video search menu through the operation entry portion 110. If necessary, the video search menu is created for the video sections that remain after excluding unwanted scenes identified by an unwanted scene identifying portion 136.

COPYRIGHT: (C)2009, JPO&INPIT

Description

The present invention relates to an encoded video reproduction apparatus and an encoded video reproduction method capable of searching a stream information file for a video scene that the user wants to view and reproducing it.

In an encoded video recording/playback apparatus such as a hard disk recorder or an optical disk recorder, a video scene that the user wants to view can be found by special playback such as fast-forward or rewind playback. For example, the user fast-forwards through a program and, on judging that the desired video scene has been displayed, manually operates the operation input unit to switch to normal playback. However, the video keeps advancing between the moment the user recognizes the desired scene during fast-forward playback and the moment the manual operation for normal playback is performed, so the user has had to perform an additional operation to rewind by the amount of the overshoot. To eliminate this additional operation, it has been proposed that, when the manual operation for normal playback is performed during fast-forward playback, normal playback be started from a video position earlier than the time of the manual operation by an offset corresponding to the fast-forward speed (see, for example, Patent Document 1).

It has also been proposed to display a plurality of video scenes scattered throughout a recorded program as a list of thumbnail images (reduced images) on the screen of a display device, so that when the user selects the desired video scene from among the thumbnail images, normal playback starts from the video position corresponding to the selected thumbnail image (see, for example, Patent Document 2).

Furthermore, a method has been proposed in which, when a predetermined operation is performed on the operation input unit, a plurality of thumbnail images temporally close to the reproduced image are displayed superimposed on the reproduced image (see, for example, Patent Document 3).

Patent Document 1: Japanese Patent Laid-Open No. 2005-293680 (pages 4-6, FIG. 5)
Patent Document 2: Japanese Patent Laid-Open No. H10-145743 (pages 4-5, FIG. 9)
Patent Document 3: Japanese Patent Laid-Open No. 2005-80027 (pages 3-5, FIG. 4)

However, in the video search method of Patent Document 1, the user must find the desired video scene while performing fast-forward or rewind playback, so that, for example, searching by fast-forward playback for a video scene in the latter half of a program takes a long time. In addition, the number of pictures that can be displayed per unit time during fast-forward playback is constant (independent of the fast-forward speed), so when the fast-forward speed is increased, the time interval between the displayed video scenes widens, and it becomes more likely that the desired video scene is never displayed and cannot be found. The wider interval between displayed video scenes also makes it harder to grasp the continuity of the video content, which makes it difficult for the user to locate the desired video scene.

In the video search method of Patent Document 2, a large number of thumbnail images are displayed as a list on the screen. Searching for the desired video scene among many thumbnail images arranged vertically and horizontally forces the user to move their gaze and to judge many thumbnail images, so the burden on the user is large and the likelihood of overlooking the desired video scene increases.

Also, the video search method of Patent Document 2 displays thumbnail images of video scenes acquired at predetermined time intervals. If the interval is set long, the desired video scene is more likely not to be displayed; if it is set short, the user must find the desired video scene among a large number of thumbnail images, which lengthens the search time and increases the burden on the user. Moreover, because the thumbnail images in Patent Document 2 show video scenes taken at fixed intervals regardless of the program content, the user still has to perform a manual operation after selecting a thumbnail image in order to start playback from the desired video scene.

In the video search method of Patent Document 3, as in Patent Document 1, the user must first find an approximate playback start position (a position temporally close to the desired video scene) by a manual operation such as fast-forward playback. In addition, after the approximate playback start position has been specified, thumbnail images are displayed as images near the playback start position in units of 0.5 seconds, which is the coding compression unit (GOP: Group of Pictures), so a large number of thumbnail images, one for each coding compression unit, must be held. The encoded video reproduction apparatus therefore needs a very high information processing capability to handle an enormous amount of information, and its circuit scale and image control algorithm become complicated.

The present invention has been made to solve the above problems of the prior art, and its object is to provide an encoded video reproduction apparatus and an encoded video reproduction method with which a desired video scene can be searched for quickly, easily, and reliably.

The encoded video reproduction apparatus of the present invention comprises: recording/reproducing means for recording and reproducing moving image data in which an encoded and compressed video signal and audio signal are multiplexed; feature scene extraction means for detecting, from the physical change amount of the video signal and using a plurality of determination criteria, video sections of a plurality of hierarchical levels corresponding to the respective determination criteria, and for extracting feature information representing each video section of each level; search screen generating means for generating, based on the feature information, a video search menu including thumbnail images representing each video section of each level; operation input means for accepting operation input by a user; and control means for causing the recording/reproducing means, when one of the thumbnail images in the video search menu is selected through the operation input means, to start reproduction from the video scene associated with the selected thumbnail image.

The encoded video reproduction method of the present invention comprises the steps of: detecting, by feature scene extraction means, from the physical change amount of the video signal of moving image data recorded by recording/reproducing means, in which an encoded and compressed video signal and audio signal are multiplexed, video sections of a plurality of hierarchical levels corresponding to a plurality of determination criteria by using those criteria, and extracting feature information representing each video section of each level; generating, by search screen generating means and based on the feature information, a video search menu including thumbnail images representing each video section of each level; and causing the recording/reproducing means, when one of the thumbnail images in the video search menu is selected through an operation input unit that accepts operation input by a user, to start reproduction from the video scene associated with the selected thumbnail image.

According to the present invention, a desired video scene can be searched for quickly, easily, and reliably, and reproduction can be started from that video scene.

FIG. 1 is a block diagram schematically showing the configuration of a system including an encoded video reproduction apparatus 100 according to an embodiment of the present invention (that is, an apparatus that implements the encoded video reproduction method according to the embodiment). The system shown in FIG. 1 includes the encoded video reproduction apparatus 100, which can reproduce a stream information file; an operation input unit 110, such as a remote controller, for inputting user instructions; and a display device 140, such as a liquid crystal monitor, that displays video based on the video signal and outputs sound based on the audio signal output from the encoded video reproduction apparatus 100. As shown in FIG. 1, the encoded video reproduction apparatus 100 includes a system control unit 120 that controls the entire apparatus, a memory unit 121, and a decoder block 130. The decoder block 130 includes a broadcast receiving unit 131 that receives digital broadcast signals and the like, a stream control unit 132, a recording/playback drive unit 133 with a removable or fixed information recording medium 133a, a video/audio decoder unit 134, a feature scene extraction unit 135 with buffer memories 135a and 135b, an unnecessary scene specifying unit 136, an OSD (On Screen Display) generation unit 137, and an addition circuit 138. The decoder block 130 is the part that provides the recording and playback function for video/audio streams, and it records and reproduces stream information in accordance with instructions from the system control unit 120. The operation input unit 110 may be an operation input unit provided on the main body of the encoded video reproduction apparatus 100, and the display device 140 may be a display unit provided on the main body of the encoded video reproduction apparatus 100.

The encoded video reproduction apparatus 100 is, for example, a video recording/playback apparatus such as a hard disk recorder or an optical disk recorder. It may also be a television with a built-in hard disk or a personal computer having a video recording and playback function. Further, when the information recording medium 133a is removable, the encoded video reproduction apparatus 100 may be a playback-only video apparatus.

The stream information file recorded by the encoded video reproduction apparatus 100 is, for example, a single multimedia file in which video information encoded and compressed by a method such as MPEG-2 and audio information encoded and compressed by a method such as AC-3 (Audio Code number 3) are multiplexed. Video search in the encoded video reproduction apparatus 100 means that, when the user wants to view only the part in which a specific scene or a specific person is recorded, the position in the stream information file at which the desired video scene is recorded is identified and playback is started from that video scene.

In this embodiment, the recording/playback drive unit 133 records and reproduces moving image data in which an encoded and compressed video signal and audio signal are multiplexed, and the feature scene extraction unit 135 detects, from the physical change amount of the video signal and using a plurality of determination criteria, video sections of a plurality of hierarchical levels corresponding to those criteria, and extracts feature information representing each video section of each level. The OSD generation unit 137 generates, based on the feature information, a video search menu including thumbnail images representing each video section of each level. In this embodiment, the determination criteria include a first threshold α and a second threshold β that is smaller than α. Each video section at the level corresponding to the first threshold α is a segment, and each video section at the level corresponding to the second threshold β is a shot, which is a section that either coincides with a segment or is obtained by dividing a segment. The video search menu includes a segment image information area containing a plurality of thumbnail images, each representing a segment, and a shot image information area containing a plurality of thumbnail images, each representing a shot (reference numerals 702 and 801 in FIG. 8, described later). The shot image information area contains the thumbnail images of a plurality of shots selected in order of temporal proximity to one of the thumbnail images in the segment image information area. When the user selects one of the thumbnail images in the video search menu using the operation input unit 110, the recording/playback drive unit 133 starts playback from the video scene associated with the selected thumbnail image in accordance with an instruction from the system control unit 120. This embodiment describes the case in which the image information areas composed of thumbnail images form two hierarchical levels, but the number of levels may be three or more.
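As a rough illustration of the two-level menu described above, the following Python sketch models segments and shots as plain records and picks, for a selected segment thumbnail, the shots closest to it in time. The names `Section` and `shots_near`, the `count` parameter, and the thumbnail file names are hypothetical, and measuring closeness from the segment start time is an assumption; the embodiment does not specify how proximity is computed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Section:
    """A video section (segment or shot) with its representative thumbnail."""
    start: float      # playback start time in seconds
    end: float        # playback end time in seconds
    thumbnail: str    # hypothetical thumbnail file name


def shots_near(segment: Section, shots: list[Section], count: int) -> list[Section]:
    """Pick up to `count` shots closest in time to the segment start.

    The result is returned in time order so it can be laid out left to right.
    """
    nearest = sorted(shots, key=lambda s: abs(s.start - segment.start))[:count]
    return sorted(nearest, key=lambda s: s.start)


segments = [Section(0, 60, "seg0.jpg"), Section(60, 150, "seg1.jpg")]
shots = [
    Section(0, 20, "s0.jpg"), Section(20, 60, "s1.jpg"),
    Section(60, 90, "s2.jpg"), Section(90, 150, "s3.jpg"),
]

# Shot thumbnails to show when the user highlights the second segment.
menu_shots = shots_near(segments[1], shots, count=3)
```

Selecting a thumbnail would then map directly to a playback start time (`Section.start`), which is what allows playback to begin at the chosen scene without further manual seeking.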

In this embodiment, the unnecessary scene specifying unit 136 identifies unnecessary scenes from the physical change amounts of both the video signal and the audio signal, and the OSD generation unit 137 can also generate the video search menu for the video sections that remain after the unnecessary scenes have been excluded. To make it easier for the user to grasp the program content during a video search, this processing generates the thumbnail image menu only for sections that exclude unnecessary scenes of low semantic importance (for example, commercial video and the overlapping video scenes that appear before and after commercials).

In general, the semantic continuity of program content approximates the continuity of the video. This embodiment therefore assumes that the semantic continuity of the program content is equal to the continuity of the video. The continuity of video is the continuity of a physical quantity of the video: the video is judged continuous when the change in the physical quantity (the physical change amount) is at or below a predetermined threshold, and discontinuous when it exceeds the threshold. In this embodiment, a point at which the video at a given playback position is judged discontinuous using the second threshold β is called a scene change (it is also a shot boundary), and the video section between one scene change and the next is called a shot. Furthermore, a point at which the video between one shot and the next is judged discontinuous using the first threshold α (α > β) is called a segment boundary (a segment boundary is also a scene change), and the video section between one segment boundary and the next is defined as a segment. When performing a video search in this embodiment, segments are detected first and the video scene is searched for segment by segment; the segment found is then searched shot by shot.

As outlined above, the continuity of video can be determined from the physical change amount of the video (for example, a color histogram, DC components, motion vectors, edge information, or texture information) and a predetermined threshold. For example, a point is judged to be a segment boundary when the physical change amount is larger than the predetermined first threshold α, and it is judged to be a scene change (shot boundary) when the physical change amount is larger than the predetermined second threshold β (β < α). Because the semantic continuity of program content approximates the continuity of the video, a segment is a large-unit video section within which the semantic continuity of the video scene is preserved, and a shot is a small-unit video section within which that continuity is preserved.
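The two-threshold decision can be sketched in a few lines of Python. This is a minimal illustration only: the function name `classify_boundaries`, the label strings, and the numeric values are hypothetical, and the physical change amount is assumed to have already been reduced to a single number per pair of consecutive access units.

```python
def classify_boundaries(changes, alpha, beta):
    """Label each change value as 'segment', 'shot', or None.

    changes[i] is the physical change amount between access unit i and i + 1.
    alpha is the first threshold (segment boundary), beta the second (scene change).
    """
    if not alpha > beta:
        raise ValueError("alpha must be greater than beta")
    labels = []
    for change in changes:
        if change > alpha:
            labels.append("segment")   # segment boundary, which is also a scene change
        elif change > beta:
            labels.append("shot")      # scene change (shot boundary) only
        else:
            labels.append(None)        # video is judged continuous here
    return labels


labels = classify_boundaries([0.1, 0.5, 0.05, 0.9], alpha=0.8, beta=0.3)
```

Because α > β, every segment boundary also exceeds β, which is why each segment either coincides with a shot or is divided into several shots.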

The recording/playback drive unit 133 shown in FIG. 1 is information recording/reproducing means capable of reading information from and writing information to the information recording medium 133a. In this embodiment, the recording/playback drive unit 133 is a hard disk drive containing a hard disk as the information recording medium 133a. The recording/playback drive unit 133 may instead be information recording/reproducing means that uses another information recording medium, such as an optical disk drive or an SD media drive (flash memory drive).

The recording/playback drive unit 133 records stream information in which encoded video is multiplexed, playback control information for the stream information, a metadata control file, and thumbnail image files. The playback control information includes video and audio attribute information for the encoded video/audio streams separated from the stream information recorded in the recording/playback drive unit 133, and information indicating, for each access unit of the stream information (usually a GOP), the correspondence between playback start time information and playback start position information. The playback start time information and playback start position information are used for random access such as time search and special playback (fast-forward and rewind playback).
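To make the role of the time-to-position correspondence concrete, the sketch below shows one way a time search could use such a table. The list `gop_index`, the function `seek_position`, and all byte offsets are hypothetical values chosen for illustration; the actual layout of the playback control information is not given in the text.

```python
import bisect

# One entry per access unit (GOP): (playback start time in seconds, byte position).
# Entries are assumed to be sorted by time.
gop_index = [(0.0, 0), (0.5, 18_800), (1.0, 37_600), (1.5, 56_400)]


def seek_position(index, target_time):
    """Return the byte position of the last GOP starting at or before target_time."""
    times = [t for t, _ in index]
    i = bisect.bisect_right(times, target_time) - 1
    return index[max(i, 0)][1]   # clamp requests before the first GOP


position = seek_position(gop_index, 1.2)
```

Starting at a GOP boundary matters because decoding can only begin at the I picture that heads each GOP.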

A thumbnail image file is a file that records a reduced image of the representative picture of a video scene and is used mainly to generate the thumbnail image menu. The representative picture of a video scene is the first picture of a segment or the first picture of a shot. However, the representative picture may also be a picture other than the first picture of the segment (for example, a picture at an arbitrary elapsed time within the segment) or a picture other than the first picture of the shot (for example, a picture at an arbitrary elapsed time within the shot). The data format of the thumbnail image file may be an uncompressed format such as a bitmap file or RAW data, or a compressed format such as JPEG. The thumbnail image file may also contain information for generating a thumbnail image, such as DC components.

The metadata control file is a file that records information for identifying, within the stream information file, the segment or shot with which each thumbnail image file is associated. In addition to this association information, the metadata control file can contain information that supplements the description of segments and shots, such as the playback time, the playback start position, the picture size of the representative picture, the video genre, and information indicating the physical characteristics of the picture.

The broadcast receiving unit 131 shown in FIG. 1 receives digital broadcast waves encoded and compressed in the MPEG-2 Transport Stream (MPEG-2 TS) format. The digital broadcast wave may be a signal in which the video and audio information of a plurality of programs is multiplexed; in that case, the broadcast receiving unit 131 needs a function for extracting only the MPEG-2 TS of a specific program. The stream control unit 132 performs overall control of the stream flow in the decoder block 130.

Next, the recording process in this embodiment will be described. The broadcast receiving unit 131 supplies the MPEG-2 TS received as a digital broadcast wave to the stream control unit 132. In accordance with an instruction from the system control unit 120, the recording/playback drive unit 133 records the MPEG-2 TS stream information supplied to the stream control unit 132.

The stream control unit 132 controls the recording of the stream information in the recording/playback drive unit 133 and, for each GOP serving as the coding compression unit, notifies the system control unit 120 of the playback start time, the playback start position, and other information needed for random access. The system control unit 120 then causes the recording/playback drive unit 133 to record this random access information (playback start time, playback start position, and so on) as playback control information.

The feature scene extraction unit 135 extracts information related to segments and shots from the stream information while the stream information is being recorded to the recording/playback drive unit 133. Specifically, the stream information is input from the stream control unit 132 to the feature scene extraction unit 135, which then identifies, for each access unit of the input stream information, the video information of the I picture (Intra Picture) located at the head of the coding compression unit. An I picture is the basic picture of a coding compression unit: a frame encoded directly from the video signal without motion prediction.

The feature scene extraction unit 135 calculates the physical change amount of the I picture for each access unit and stores it in a buffer memory provided in the feature scene extraction unit 135. The physical change amount is physical information obtained from the image information of a picture, such as a color histogram, DC components, edge information, texture information, or motion vectors, or a combination of these. This embodiment describes the case in which the DC components representing the features of a picture are stored in the buffer memory 135a or 135b. The physical change amount is then accumulated for each successive access unit and compared with the physical change amounts stored previously. In this image comparison, a section can be classified as a segment or a shot according to the (spatial and temporal) characteristics of the physical change amount. A segment boundary or a shot boundary (scene change) is determined according to whether the comparison result exceeds a preset threshold (the first threshold α or the second threshold β).
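The comparison of DC components between consecutive I pictures might look like the following Python sketch. Several things here are assumptions rather than the patent's method: the DC image is approximated by the mean of each 8x8 block of a decoded grayscale frame (in an MPEG-2 stream the DC coefficients would normally be read from the bitstream), the comparison metric is a mean absolute difference, and the function names `dc_image` and `dc_change` are hypothetical.

```python
def dc_image(frame, block=8):
    """Reduce a grayscale frame (a list of pixel rows) to per-block mean values."""
    rows, cols = len(frame), len(frame[0])
    out = []
    for r in range(0, rows - block + 1, block):
        out_row = []
        for c in range(0, cols - block + 1, block):
            total = sum(frame[r + i][c + j]
                        for i in range(block) for j in range(block))
            out_row.append(total / (block * block))
        out.append(out_row)
    return out


def dc_change(prev_dc, cur_dc):
    """Mean absolute difference between two DC images of equal size."""
    diffs = [abs(a - b)
             for prev_row, cur_row in zip(prev_dc, cur_dc)
             for a, b in zip(prev_row, cur_row)]
    return sum(diffs) / len(diffs)


# Two synthetic 16x16 I pictures: a dark frame followed by a bright frame.
dark = [[10] * 16 for _ in range(16)]
bright = [[200] * 16 for _ in range(16)]

previous = dc_image(dark)     # plays the role of the buffered earlier picture
current = dc_image(bright)    # plays the role of the newly arrived picture
change = dc_change(previous, current)
```

The resulting `change` value is the kind of scalar that would then be tested against the thresholds α and β. Keeping the previous and current DC images in two separate variables loosely mirrors the two buffer memories 135a and 135b, although the text does not say how the buffers are actually used.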

In accordance with instructions from the system control unit 120, a thumbnail image of the representative picture of each video section determined to be a segment or a shot is recorded in a thumbnail image file in the recording/playback drive unit 133. At the same time, the file name of the recorded thumbnail image file, the start time, the end time, the I picture position, and the I picture size are recorded in the metadata control file in the recording/playback drive unit 133. The thumbnail image may be generated by re-encoding the I picture, or the DC components used in the image comparison may be used as they are.

The processing that identifies segments and shots only needs to be completed before the thumbnail image menu described later is displayed. It may be performed during recording by the recording/playback drive unit 133, as appropriate after playback has started, or on a selected program while the apparatus is offline.

In this embodiment, as an example that balances the segment and shot extraction processing against the system processing load, the physical change amounts are compared only for I pictures, which are the representative pictures of the access units. If the system has sufficient processing performance, the physical change amounts may instead be compared for all frames.

Next, the unnecessary scene specifying process of this embodiment is described. While stream information is being recorded, the unnecessary scene specifying unit 136 extracts section information related to commercial video and to duplicated video scenes located before and after the commercial video. Specifically, stream information is first input from the stream control unit 132 to the unnecessary scene specifying unit 136. The unnecessary scene specifying unit 136 then decodes the input stream information and identifies the time information of silent parts from the audio information. It compares the time information of the identified silent parts with the time information of the scene change points identified by the feature scene extraction unit 135, and determines whether the video and audio change points characteristic of commercial video occur periodically. In accordance with instructions from the system control unit 120, information indicating the sections determined to be commercial video is recorded in the recording/playback drive unit 133 as a metadata control file. Here, silent parts exist before and after the commercials (CMs) of a television broadcast, that is, between the main program and a CM period and at the boundaries between the individual CMs within a CM period. For this reason, the silent parts of the television broadcast are detected when detecting CM boundaries. The periodicity of the video and audio change points characteristic of commercial video means, for example, that the boundaries of individual CMs are arranged to appear at a fixed period (for example, 15 seconds or 30 seconds in Japanese TV broadcasting); this is also called the "CM rule".
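
The CM rule described above can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: the matching tolerance of 0.5 seconds, the function names, and the policy of chaining consecutive boundaries into one CM section are hypothetical and are not taken from the embodiment.

```python
from typing import List, Sequence, Tuple


def find_cm_boundaries(silences: Sequence[float],
                       scene_changes: Sequence[float],
                       tolerance: float = 0.5) -> List[float]:
    """Return silent-part times (seconds) that coincide with a scene change point."""
    return sorted(s for s in silences
                  if any(abs(s - c) <= tolerance for c in scene_changes))


def find_cm_sections(boundaries: Sequence[float],
                     periods: Sequence[float] = (15.0, 30.0),
                     tolerance: float = 0.5) -> List[Tuple[float, float]]:
    """Chain consecutive boundaries whose spacing matches a CM period into sections."""
    sections: List[Tuple[float, float]] = []
    start = end = None
    for a, b in zip(boundaries, boundaries[1:]):
        if any(abs((b - a) - p) <= tolerance for p in periods):
            if start is None:
                start = a          # first boundary of a new CM section
            end = b                # extend the current CM section
        elif start is not None:
            sections.append((start, end))
            start = end = None
    if start is not None:
        sections.append((start, end))
    return sections
```

A silent part that has no nearby scene change, or a pair of boundaries whose spacing matches neither period, is ignored, which is one simple way to require that both the audio and the video change at the same point.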

After a commercial video section has been identified, the unnecessary scene specifying unit 136 may ask the system control unit 120 whether shots exist before and after that section. The system control unit 120 reads the metadata control file and the thumbnail image files in the recording/playback drive unit 133 and determines whether sections with high image correlation exist before and after the section determined to be commercial video. If such highly correlated sections exist, they may be determined to be duplicated video scenes displayed on either side of the commercial video section. One of the duplicated video scenes so determined, together with the commercial video scene, may then be registered as unnecessary scenes in the metadata control file. With this configuration, a thumbnail image menu can be generated that does not present the same shot near a commercial more than once.

Next, the playback process of this embodiment is described. When stream information recorded in the recording/playback drive unit 133 is to be played back, the system control unit 120 first reads the playback control information related to that stream information from the recording/playback drive unit 133 and holds it in the memory unit 121. The recording/playback drive unit 133 then reads the recorded stream information and supplies it to the video/audio decoder unit 134 via the stream control unit 132. The video/audio decoder unit 134 takes in the stream information sequentially and separates it into the encoded and compressed video stream and audio stream. It then decodes the video stream, encoded by the MPEG-2 method or the like, into a video signal, and decodes the audio stream, encoded by the AC-3 method or the like, into an audio signal.

In accordance with instructions from the system control unit 120, the OSD generation unit 137 generates a video search screen that uses the thumbnail images (shown in FIGS. 7 and 8, described later). The generated video search screen signal is superimposed by the addition circuit 138 on the video signal output from the video/audio decoder unit 134. The video and audio signals output in this way are input to the display device 140, which displays the video with the video search screen superimposed.

The operation input unit 110 refers to the operation panel arranged on the front panel of the encoded video reproduction device 100, a remote control, or the like. The operation input unit 110 is provided with keys for selecting programs and video scenes, for example up, down, left, and right keys and an enter key. The system control unit 120 interprets the commands requested through the operation input unit 110 and controls the decoder block 130 to play back any desired stream information.

This embodiment describes the case in which the feature scene extraction unit 135 and the unnecessary scene specifying unit 136 are configured as hardware inside the decoder block 130. They may instead be located outside the decoder block 130, or may be implemented as firmware having the same functions.

FIG. 2 is a diagram showing the logical file structure in the information recording medium 133a of the recording/playback drive unit 133. As shown in FIG. 2, the root directory 200 is placed at the top level of the logically hierarchical file structure, the multimedia directory 201 is placed below the root directory 200, and the stream management directory 202 and the metadata management directory 203 are placed below the multimedia directory 201. The metadata management directory 203 and the files inside it are collectively called the metadata recording area. This embodiment describes the case in which the information in the metadata recording area is generated when a program is recorded, but it only needs to be generated before the search screen of the present invention is displayed; for example, it may be generated before playback starts or while playback is starting.

In the logical file structure shown in FIG. 2, the playback control information file 211 is placed below the multimedia directory 201. The playback control information file 211 describes the management information of the recorded programs in the recording/playback drive unit 133. The stream information file 212 is a file in which at least one of the video signal and the audio signal of a recorded program is encoded and compressed, and multiplexed together with playback time information. The metadata management file 213 is a file that describes the management information of the feature scene data, and the thumbnail image file 214 is a file in which the representative picture of a feature scene of a recorded program is recorded.

One stream information file 212 is allocated to each recorded program, and each is given a unique file name so that the program can be identified. FIG. 2 shows the case in which a five-digit number is assigned as the file name, but other file names may be used.

One thumbnail image file 214 is allocated to each feature scene. FIG. 2 illustrates thumbnail image file names consisting of a five-digit number and a six-digit number separated by an underscore. The first five digits hold the name of the stream information file 212 that contains the feature scene. The last six digits record the frame number counted from the head of that stream information file 212. For example, the thumbnail image file "00002_000135.dat" is recorded in association with the stream information file 212 named 00002.mts and holds the thumbnail image of the 135th frame from the head.
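
The naming convention can be expressed as a pair of small Python helpers. This is a sketch only: the function names are hypothetical, and the ".dat" extension is taken from the example file name in the text and assumed to apply to every thumbnail image file.

```python
import re
from typing import Tuple

# Five digits (stream information file name), underscore, six digits (frame number).
_THUMBNAIL_RE = re.compile(r"^(\d{5})_(\d{6})\.dat$")


def thumbnail_name(stream_name: str, frame_number: int) -> str:
    """Build a thumbnail file name from a 5-digit stream name and a frame number."""
    if not re.fullmatch(r"\d{5}", stream_name):
        raise ValueError("stream name must be exactly five digits")
    if not 0 <= frame_number <= 999_999:
        raise ValueError("frame number must fit in six digits")
    return f"{stream_name}_{frame_number:06d}.dat"


def parse_thumbnail_name(file_name: str) -> Tuple[str, int]:
    """Split a thumbnail file name into (stream name, frame number)."""
    match = _THUMBNAIL_RE.match(file_name)
    if match is None:
        raise ValueError(f"not a thumbnail file name: {file_name!r}")
    return match.group(1), int(match.group(2))
```

For the example in the text, `parse_thumbnail_name("00002_000135.dat")` yields the stream name "00002" and frame number 135.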

Although the example places the stream information file 212 and the thumbnail image file 214 in separate directories, they may be placed in the same directory, or the metadata management file 213 and the thumbnail image file 214 may be placed directly in the root directory 200. Likewise, although the example records the feature information of all recorded programs together in a single metadata management file 213, the feature information of the recorded programs may be divided across a plurality of files. The thumbnail image files 214 are shown as separate files for each feature scene, but they may instead be managed together in a single file.

The metadata management directory 203 of this embodiment may be recorded in a predetermined physical address range of the recording/playback drive unit 133. With this configuration, disk seeks can be reduced when feature scene information is recorded or erased in bulk, so data can be read and written faster.

The data format of the metadata management file 213 may be text, binary, or any other format. The metadata management file 213 may also be encrypted to prevent tampering by third parties and leakage of information.

The thumbnail image file 214 only needs to allow the image information representing the video data in the stream information file 212 to be decoded, and may use either lossy or lossless compression. As with the metadata management file 213, the thumbnail image file 214 may be encrypted to prevent tampering by third parties and leakage of information.

The playback control information file 211 may also describe whether the metadata management directory 203 or the metadata management file 213 exists, and, where a metadata management file 213 or thumbnail image file 214 is described, whether its values are valid. With this configuration, the system control unit 120 can quickly determine the presence or validity of the metadata management file 213 or the thumbnail image file 214 without referring to the information in the metadata recording area.

FIG. 3 is a diagram showing the four-layer data management structure of the stream information file 212 shown in FIG. 2. In the lowest layer, the frame layer, the stream information file 212 is subdivided into video frames, and a predetermined number of video frames form an access unit called a GOP 300 for each encoding compression unit. The first frame of a GOP 300 is an I picture. In the shot layer, which is above the frame layer, the stream information file 212 consists of a plurality of shots 301 (Shot #1 to #7 in FIG. 3), each representing the section between scene changes, and each shot consists of one or more GOPs 300. In the segment layer, which is above the shot layer, the stream information file 212 consists of a plurality of segments 302 (Segment #1 to #4 in FIG. 3), each representing a semantically continuous section, and each segment consists of one or more shots 301.
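
The containment relations of the frame, shot, and segment layers can be modeled with a few Python data classes. This is an illustrative in-memory model only, not the on-disk format; the class and field names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GOP:
    """Access unit: a run of frames whose first frame is an I picture."""
    frame_count: int


@dataclass
class Shot:
    """Section between two scene changes; holds one or more GOPs."""
    gops: List[GOP]

    def __post_init__(self) -> None:
        if not self.gops:
            raise ValueError("a shot must contain at least one GOP")

    @property
    def frame_count(self) -> int:
        return sum(g.frame_count for g in self.gops)


@dataclass
class Segment:
    """Semantically continuous section; holds one or more shots."""
    shots: List[Shot]

    def __post_init__(self) -> None:
        if not self.shots:
            raise ValueError("a segment must contain at least one shot")

    @property
    def frame_count(self) -> int:
        return sum(s.frame_count for s in self.shots)
```

Because a shot always begins at a GOP boundary, the representative picture of a shot or segment can always be the I picture at the head of its first GOP.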

FIG. 4 is a diagram showing the syntax of the playback control information file 211 shown in FIG. 2. The playback control information file 211 contains disc general information 400 and recorded program information 410. As shown in FIG. 4, the disc general information 400 holds the attribute information of the disc and contains "meta_disc_valid_flag" 401 and "disc_name" 402. "meta_disc_valid_flag" 401 is flag information indicating whether the information in the recorded metadata recording area is valid or invalid. "disc_name" 402 is information indicating the disc name.

As shown in FIG. 4, the recorded program information 410 holds the management information of the recorded programs recorded on the optical disc. The recorded program information 410 contains "num_of_title" 411, which indicates the total number of recorded programs, and the body of the following loop statement "for (i = 0; i < num_of_title; i++)" is repeated the number of times indicated by "num_of_title" 411.

The stream information file name 412 is five-digit numeric information indicating the name of the stream information file 212 associated with the recorded program. The start time information 413 and the end time information 414 are the start time and end time, referenced to the system time multiplexed in the stream information file 212 associated with the recorded program. Although this embodiment shows an example in which the system time is recorded, the start time information 413 may simply be set to 00 hours 00 minutes 00 seconds, indicating the beginning of the program, and the end time information 414 may be set to the elapsed time up to the end of the program. As shown in FIG. 4, the stream information file name 412, the start time information 413, and the end time information 414 identify the stream information file 212 and determine the read position within it; together they are called playback section information.

"meta_title_valid_flag" 415 is flag information indicating whether the metadata recording area holds video search metadata for the recorded program. When the user instructs playback of a recorded program, "meta_title_valid_flag" 415 makes it possible to determine whether the metadata needs to be created.

The attribute information management table 420 records attribute information such as that of the video information and audio information multiplexed in the stream information file 212. It also stores a packet ID and the like for each piece of video information and audio information making up the stream, and the video/audio decoder unit 134 uses these packet IDs to separate the stream into video data, audio data, stream management data, and so on.

The access point management table 430 records list information giving the stream read position and the playback start time for each access unit, and this list information is used for random access playback such as search and special playback. For example, when the video data is encoded as an MPEG-2 video stream, the head of each GOP corresponds to an access unit, and the playback start time and the playback start address (the position counted from the head of the stream file) are described for each GOP. The encoded video reproduction device 100 derives the playback start address in the stream information file 212 from the playback start time information and performs random access playback.
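
The lookup from a requested playback time to a playback start address can be sketched in Python with a binary search. The table layout used here, a time-sorted list of (playback start time in seconds, byte address) pairs with one entry per GOP, and the function name are assumptions made for illustration.

```python
from bisect import bisect_right
from typing import List, Tuple

AccessPoint = Tuple[float, int]  # (playback start time in seconds, byte address)


def find_access_point(table: List[AccessPoint], target_time: float) -> AccessPoint:
    """Return the last access point that starts at or before target_time.

    The table must be sorted by playback start time.
    """
    times = [t for t, _ in table]
    index = bisect_right(times, target_time) - 1
    if index < 0:
        raise ValueError("target time precedes the first access point")
    return table[index]
```

Decoding then starts from the returned address, that is, from the I picture at the head of the GOP containing the requested time.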

FIG. 5 is a diagram showing the syntax of the metadata management file 213 shown in FIG. 2. First, "num_of_title" 501, which indicates the total number of recorded programs, is recorded, and the body of the following loop statement "for (i = 0; i < num_of_Title; i++)" is repeated the number of times given by "num_of_title" 501. The body of this loop contains thumbnail information 510 and unnecessary scene information 540.

As shown in FIG. 5, the thumbnail information 510 contains "num_of_segment" 511 and "num_of_shot" 512, which record the number of segments and the number of shots, respectively, in the recorded program title.

The body of the loop statement "for (j = 0; j < num_of_segment; j++)" shown in FIG. 5 is repeated the number of times given by "num_of_segment" 511, and the information in this loop is called segment information 520. The segment information 520 contains "ref_to_shotID" 521, "thumbnail_segment_name" 523, "Start_segment_time" 524, and "End_segment_time" 525. "ref_to_shotID" 521 is an ID number identifying the shot associated with the segment, "thumbnail_segment_name" 523 is the file name of the thumbnail image file 214 representing the segment, "Start_segment_time" 524 is time information indicating the start time of the segment, and "End_segment_time" 525 is time information indicating the end time of the segment. When "thumbnail_segment_name" 523 is set to a specific value such as 0xFFFF, this indicates that no thumbnail image file 214 associated with the segment exists. In that case, the thumbnail image file associated with the segment may be generated separately by reading the stream information file 212 again.

The body of the loop statement "for (k = 0; k < num_of_shot; k++)" shown in FIG. 5 is repeated the number of times given by "num_of_shot" 512, and the information in this loop is called shot information 530. The shot information 530 contains "thumbnail_shot_name" 531, "Start_shot_time" 532, "End_shot_time" 533, "I_picture_position" 534, and "I_picture_size" 535. "thumbnail_shot_name" 531 is the file name of the thumbnail image file 214 representing the shot, "Start_shot_time" 532 is time information indicating the start time of the shot, "End_shot_time" 533 is time information indicating the end time of the shot, "I_picture_position" 534 is the start position of the first I picture in the shot, and "I_picture_size" 535 is the size of the first I picture in the shot. "I_picture_position" 534 indicates the number of bytes counted from the head of the stream information file 212, and "I_picture_size" 535 indicates the number of bytes counted from the head of that I picture. "I_picture_position" 534 and "I_picture_size" 535 only need to be information from which the position can be identified, and may be recorded in any unit of measurement, such as bytes, sectors, or packets. When "thumbnail_shot_name" 531 is set to a specific value such as 0xFFFF, this indicates that no thumbnail image file 214 associated with the shot exists. In that case, the thumbnail image file associated with the shot may be generated separately by reading the stream information file 212 again.

As shown in FIG. 5, the unnecessary scene information 540 contains "num_of_scene" 541, which indicates the number of unnecessary sections, such as commercial video, contained in the recorded program. The body of the loop statement "for (l = 0; l < num_of_scene; l++)" shown in FIG. 5 is repeated the number of times given by "num_of_scene" 541, and the information in this loop is called unnecessary scene information 550. The unnecessary scene information 550 contains "Start_scene_time" 551 and "End_scene_time" 552. "Start_scene_time" 551 indicates the start time of a video scene, such as commercial video, that is not needed for understanding the program, and "End_scene_time" 552 indicates the end time of that section.
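
One way the recorded unnecessary sections could be used when building the search menu is sketched below in Python. The representation of shots and unnecessary scenes as (start time, end time) pairs in seconds, the rule that a shot is dropped when its start time falls inside an unnecessary section, and the function names are all assumptions made for illustration.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start time, end time) in seconds


def in_unnecessary_scene(time_sec: float, scenes: List[Interval]) -> bool:
    """True if time_sec lies in any [Start_scene_time, End_scene_time) interval."""
    return any(start <= time_sec < end for start, end in scenes)


def menu_shots(shots: List[Interval], scenes: List[Interval]) -> List[Interval]:
    """Keep only the shots whose start time lies outside every unnecessary scene."""
    return [shot for shot in shots if not in_unnecessary_scene(shot[0], scenes)]
```

Treating the end time as exclusive lets a shot that begins exactly where a commercial section ends remain in the menu.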

Although the example stores the information of the metadata management file 213 in a single file, it may be divided into a plurality of files according to the characteristics of the information.

FIG. 6 is a diagram showing the relationship between segments and shots in the data structure of the metadata management file 213 shown in FIG. 5. As shown in FIG. 6, the segments are described as a table with one entry per segment information 520, and each segment information 520 is assigned a "Segment_ID" 601 counted from "0". Similarly, the shots are described as a table with one entry per shot information 530, and each shot information 530 is assigned a "Shot_ID" 602 counted from "0". FIG. 6 shows the case in which Nsegment "Segment_ID" 601 entries and Nshot "Shot_ID" 602 entries are recorded.

The value set in "ref_to_shotID" 521 is the ID number identifying the shot at which the segment starts. For example, in FIG. 6, the segment information 520 whose "Segment_ID" 601 is "3" ("Segment_ID" = 3) has the value "7" set in "ref_to_shotID" 521. That is, the shot at which this segment starts is the shot information 530 whose "Shot_ID" 602 is "7" ("Shot_ID" = 7).

In this way, the shot information 530 associated with a segment information 520 can be identified, and the recording position and size of the I picture that is the representative picture of the segment can be obtained from "I_picture_position" 534 and "I_picture_size" 535 recorded in that shot information 530. Structuring the control information between segments and shots in this way allows the shot to be identified immediately from the segment, so that once a segment is specified, the thumbnail information associated with the shot can be read out at once. Even when no thumbnail image is recorded in association with a segment or shot, the recording position and size of the I picture can be identified quickly, so a thumbnail image file can be generated in a short time.

When the thumbnail file name information is shared between segments and shots, "thumbnail_segment_name" 523 need not be held. For example, in FIG. 6, the thumbnail file name for the segment information 520 with "Segment_ID" = 3 can be identified as "00030_001800", the value set in "thumbnail_shot_name" 531 of the shot information 530 with "Shot_ID" = 7. This avoids redundant information between segments and shots and allows the thumbnail information to be managed with less data.
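
The segment-to-shot lookup can be sketched in Python as follows. The tables are modeled as lists of dictionaries indexed by "Segment_ID" and "Shot_ID"; this in-memory layout and the function name are assumptions made for illustration, while the field names follow FIG. 5.

```python
from typing import Dict, List, Union

Record = Dict[str, Union[int, str]]


def resolve_segment(segments: List[Record], shots: List[Record],
                    segment_id: int) -> Record:
    """Follow ref_to_shotID from a segment to the shot at which it starts.

    Returns the thumbnail name and the I picture position and size that are
    shared with that shot, so the segment itself need not store them.
    """
    shot_id = int(segments[segment_id]["ref_to_shotID"])
    shot = shots[shot_id]
    return {
        "Shot_ID": shot_id,
        "thumbnail": shot["thumbnail_shot_name"],
        "I_picture_position": shot["I_picture_position"],
        "I_picture_size": shot["I_picture_size"],
    }
```

Because both tables are indexed directly by ID, the lookup is a constant-time operation regardless of how many shots the program contains.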

FIG. 7 is a diagram showing an example of the video search screen (basic) displayed on the display device 140 in the present embodiment. As shown in FIG. 7, the video search screen (basic) is the screen image output to the display device 140 when a video scene search is performed during playback of a recorded program. For example, pressing the pause key of the operation input unit 110, or a dedicated video search key, during playback of a recorded program displays the video search screen (basic). The video search screen (basic) can be displayed whenever a recorded program has been selected, not only during playback.

As shown in FIG. 7, the video search screen (basic) is composed by displaying a playback time bar area 701, a segment image information area 702, and a segment attribute information area 703 over the recorded program video (paused). Although FIG. 7 shows these superimposed areas as opaque, a certain degree of transparency may be set to improve the visibility of the recorded program video.

The playback time bar area 701 displays information such as the start time, end time, and current time of the currently selected recorded program. In FIG. 7, the start time is shown as "00:00:00" (0 hours 00 minutes 00 seconds), the end time as "00:30:00" (0 hours 30 minutes 00 seconds), and the current time as "00:22:15" (0 hours 22 minutes 15 seconds).

The playback time bar area 701 represents the span from the start time to the end time as a bar. Sections 701a, 701b, and 701c within the bar indicate unnecessary scenes such as commercial video and duplicated scenes, and section 701d indicates the playback section of the currently selected segment. Recorded programs commonly place a scene of high viewing value immediately after a commercial. Deliberately showing the user where the commercial sections are therefore improves searchability, because the user can avoid watching the commercials, and it also lets the user quickly locate highlight scenes of high viewing value.

Next, in the segment image information area 702, the thumbnail image representing the currently selected segment is placed in the center, together with the thumbnail images of the two segments immediately preceding it and the two segments immediately following it. In the example of FIG. 7, later segments are placed to the right of the currently selected segment and earlier segments to its left. That is, the thumbnail images in the segment image information area 702 are arranged from left to right in time order. To make the currently selected segment in the center stand out from the others, it may be emphasized by adding a frame, changing its color saturation, or making the image slightly larger.

The thumbnail images displayed in the segment image information area 702 should preferably be shown at the resolution recorded in the thumbnail image file 214, without scaling. Scaling would require the encoded video reproduction apparatus 100 to perform image enlargement or reduction, which loads the system and may degrade responsiveness. The number of thumbnail images placed in the segment image information area 702 should preferably be odd, so that the currently selected image can be displayed in the center, which makes it easier for the user to recognize. At present, thumbnail image files use the DC components from MPEG compression, so the image size is 1/8 of the original. Displayed without scaling, eight thumbnail images fit in the segment image information area 702. However, eight is an even number, and the thumbnails would be packed with no gaps between them, which severely impairs visibility. Under these conditions, placing seven or five thumbnail images is considered optimal.
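The arithmetic behind "seven or five" can be sketched as below. The display width of 720 pixels and the function name are assumptions made for illustration; the description only states that an unscaled DC-component thumbnail is 1/8 the size and that eight fit in the area.

```python
def choose_thumbnail_count(display_width: int, scale_divisor: int = 8) -> int:
    """Pick the largest odd thumbnail count that still leaves room for gaps.

    display_width is a hypothetical width of area 702 in pixels; scale_divisor
    reflects the 1/8 size of DC-component thumbnails.
    """
    thumb_width = display_width // scale_divisor
    if thumb_width == 0:
        raise ValueError("display too narrow for unscaled thumbnails")
    max_fit = display_width // thumb_width  # 8 when the width divides evenly
    count = max_fit - 1                     # give up one slot to leave gaps
    if count % 2 == 0:                      # keep the count odd so one image is central
        count -= 1
    return max(count, 1)


print(choose_thumbnail_count(720))
```

Dropping further to five thumbnails, as the text also allows, simply widens the gaps.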

In the segment image information area 702, the currently selected segment can be changed by pressing the left and right keys of the operation input unit 110. For example, pressing the right key shifts all the thumbnail images displayed in the segment image information area 702 to the left. The image labeled D in FIG. 7 (a building and a person) moves to the center, and that thumbnail image becomes the selected one. Because all the thumbnails shift left, the image labeled A in FIG. 7 (the sun and a building) is no longer displayed, and the thumbnail image of the segment that follows E in FIG. 7 is newly loaded and displayed at the rightmost position.
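The shifting behavior can be modeled as a fixed-width window centered on the selected index. The following is a minimal sketch; the function names and the label "F" for the segment after E are hypothetical, and slots beyond either end of the program are simply left empty here.

```python
from typing import List, Optional, Sequence


def visible_thumbnails(items: Sequence[str], selected: int,
                       radius: int = 2) -> List[Optional[str]]:
    """Return the thumbnails shown in area 702: `radius` before and after the
    selected item, with None for slots that fall outside the program."""
    return [items[i] if 0 <= i < len(items) else None
            for i in range(selected - radius, selected + radius + 1)]


def press_right(items: Sequence[str], selected: int) -> int:
    """Move the selection one segment later, stopping at the last segment."""
    return min(selected + 1, len(items) - 1)


def press_left(items: Sequence[str], selected: int) -> int:
    """Move the selection one segment earlier, stopping at the first segment."""
    return max(selected - 1, 0)


segments = ["A", "B", "C", "D", "E", "F"]  # "F" stands for the segment after E
selected = 2                               # C is in the center, as in FIG. 7
before = visible_thumbnails(segments, selected)
selected = press_right(segments, selected)
after = visible_thumbnails(segments, selected)
print(before, after)
```

Only the newly exposed slot needs a thumbnail read, which matches the description of a single image being loaded at the rightmost position.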

Pressing the enter key of the operation input unit 110 seeks to the video scene represented by that segment and starts normal playback from that position.

The segment attribute information area 703 displays the ordinal number of the currently selected segment and the total number of segments, the start time of the currently selected segment and the total playback time of the program, the duration of the segment, and so on.

Pressing the left and right keys of the operation input unit 110 updates the thumbnail images in the segment image information area 702 one step at a time, and the contents of the playback time bar area 701 and the segment attribute information area 703 are updated in step with them. In the playback time bar area 701, the current time and the position of the colored section for the segment containing the current time are updated. Likewise, the segment attribute information area 703 is updated with the information of the next segment. The paused playback image shown in the background may be switched to the video of the next segment, or the image at which playback was paused when the search began may remain displayed.

FIG. 8 is a diagram showing an example of the video search screen (details) displayed on the display device 140 in the present embodiment. As shown in FIG. 8, the video search screen (details) is the screen image output to the display device 140 when a finer-grained video scene search is performed from the video search screen (basic). It is a second-level thumbnail image menu displayed when a first-level thumbnail image representing a segment is selected on the video search screen (basic) shown in FIG. 7. The video search screen (details) is displayed by pressing the up key of the operation input unit 110, or a dedicated video search key, while a segment is selected on the video search screen (basic).

As shown in FIG. 8, the video search screen (details) displays a shot image information area 801 in addition to the information on the video search screen (basic), and the segment attribute information area 703 is replaced by a shot attribute information area 802. As shown in FIG. 8, the thumbnail images of segments other than the selected one may be displayed with altered color saturation to make clear that they are not selected. As in FIG. 7, a certain degree of transparency may be set for the shot image information area 801 to improve the visibility of the playback video.

In the shot image information area 801, the thumbnail image representing the currently selected shot (in FIG. 8, the same image as the thumbnail of the currently selected segment, C in area 702) is placed in the center, together with the thumbnail images of the three shots immediately preceding it and the three shots immediately following it. The arrangement rule for thumbnail images in the shot image information area 801 is the same as that in the segment image information area 702. The shot image information area 801 could display only the shots within the selected segment, but in the example of FIG. 8, thumbnail images of shots in the neighboring segments are also placed as far as the number of available slots allows.

In the shot image information area 801, the currently selected shot can be changed by pressing the left and right keys of the operation input unit 110. The movement rule is the same as the one applied to the currently selected segment in the segment image information area 702.

Pressing the enter key of the operation input unit 110 seeks to the video scene represented by the shot currently selected in the shot image information area 801 and starts normal playback from the position of that scene.

The shot attribute information area 802 displays the shot number indicating the order of the currently selected shot and the total number of shots, the start time of the currently selected shot and the total program time, and the duration of the currently selected shot.

Pressing the left and right keys of the operation input unit 110 updates the shot image information area 801, and the contents of the playback time bar area 701, the segment image information area 702, and the shot attribute information area 802 are updated in step with it. In the playback time bar area 701, the current time and the position of the colored section for the shot containing the current time are updated. Likewise, the segment image information area 702 is updated so that the segment to which the newly selected shot belongs is placed in the center, and the shot attribute information area 802 is updated with the information of the next shot. The paused playback image shown in the background may be switched to the video of the next shot, or the image at which playback was paused when the search began may remain displayed.

FIGS. 9(A) to 9(E) are explanatory diagrams of the process of generating the thumbnail screen while excluding the unnecessary scenes identified by the unnecessary scene specifying unit 136 in the present embodiment. FIGS. 9(A) to 9(C) show an example of the data structure of the segments and shots recorded in the metadata management file 213, and FIGS. 9(D) and 9(E) show a data model based on the unnecessary scene information in the metadata management file 213. For the video search screens shown in FIGS. 7 and 8, thumbnail images should preferably be selected with the unnecessary scene information of FIG. 9(C) taken into account. The reason is that commercial video has a large amount of physical change, so many segments and shots are likely to be detected in it. Moreover, commercial video has little viewing value for the user, and displaying commercial scenes as thumbnail images is very inefficient from the standpoint of video search.

As shown in FIGS. 9(D) and 9(E), applying a correction to the selection of the thumbnail images that represent the corrected segments and corrected shots restricts the video search to scenes of high viewing value for the user. Specifically, as shown in FIGS. 9(D) and 9(E), any shot that contains even part of an unnecessary scene is treated as invalid for video search. When thumbnail images are selected with unnecessary scenes taken into account as in FIGS. 9(A) to 9(E), the representative image of Segment #2 changes from the thumbnail image of Shot #4 to the thumbnail image of Shot #6. Thus, even when the thumbnail image that represents a segment falls within an unnecessary scene, the video search screen does not present the user with scenes that are meaningless for program search, which improves searchability.

FIGS. 9(A) to 9(E) describe the case in which any shot containing even part of an unnecessary scene is determined to be invalid, so that three shots (Shot #3, Shot #4, and Shot #5) are determined to be invalid. Other methods of determining invalid shots may also be adopted.

As shown in FIGS. 10(A) to 10(E), a shot may be determined to be invalid only when an unnecessary scene covers the entire shot. In this case, as shown in FIGS. 10(A) to 10(E), one shot (Shot #4) is determined to be invalid.

Alternatively, as shown in FIGS. 11(A) to 11(E), a shot that an unnecessary scene covers only in part may be determined to be invalid when the unnecessary portion lasts for at least a fixed reference time T0. As shown in FIGS. 11(B) and 11(C), the unnecessary-scene period T3 is shorter than the reference time T0, so Shot #3 is not determined to be invalid, whereas the unnecessary-scene period T5 is longer than the reference time T0, so Shot #5 is determined to be invalid. Consequently, two shots (Shot #4 and Shot #5) are determined to be invalid.
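The three determination methods of FIGS. 9 to 11 can be compared side by side with a small interval-overlap sketch. All times below are invented for illustration, as are the rule names `any`, `full`, and `threshold`. The composition of Segment #2 (Shots #4 to #6) is inferred from the example in which its representative image moves from Shot #4 to Shot #6. Treating a fully covered shot as invalid under the T0 rule is also an inference from the stated result (Shot #4 and Shot #5).

```python
from typing import Dict, List, Tuple

Interval = Tuple[float, float]  # (start, end) in seconds


def overlap(a: Interval, b: Interval) -> float:
    """Length of the intersection of two intervals (0 if they do not meet)."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def is_invalid(shot: Interval, unwanted: List[Interval],
               rule: str, t0: float = 0.0) -> bool:
    covered = sum(overlap(shot, u) for u in unwanted)
    fully = any(u[0] <= shot[0] and shot[1] <= u[1] for u in unwanted)
    if rule == "any":        # FIG. 9: any unnecessary portion invalidates the shot
        return covered > 0
    if rule == "full":       # FIG. 10: only a fully covered shot is invalid
        return fully
    if rule == "threshold":  # FIG. 11: fully covered, or unnecessary part >= T0
        return fully or covered >= t0
    raise ValueError(f"unknown rule: {rule}")


# Illustrative timeline: the unnecessary scene covers the last 2 s of Shot #3
# (T3 = 2), all of Shot #4, and the first 6 s of Shot #5 (T5 = 6).
shots: Dict[int, Interval] = {3: (20.0, 30.0), 4: (30.0, 40.0),
                              5: (40.0, 50.0), 6: (50.0, 60.0)}
unwanted: List[Interval] = [(28.0, 46.0)]


def invalid_ids(rule: str, t0: float = 0.0) -> List[int]:
    return [sid for sid, span in shots.items()
            if is_invalid(span, unwanted, rule, t0)]


def representative(shot_ids: List[int], rule: str, t0: float = 0.0) -> int:
    """First shot of a segment that is still valid under the chosen rule."""
    return next(s for s in shot_ids
                if not is_invalid(shots[s], unwanted, rule, t0))


print(invalid_ids("any"), invalid_ids("full"), invalid_ids("threshold", t0=4.0))
print(representative([4, 5, 6], "any"))
```

The stricter the rule, the more thumbnails survive, so the choice trades search precision against the number of scenes offered to the user.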

FIG. 12 is a flowchart showing the process of acquiring segment information and shot information in the encoded video reproduction apparatus 100. The operation by which the feature scene extraction unit 135 acquires segments and shots is described in detail with reference to FIG. 12. The feature scene extraction unit 135 contains two buffer memories (135a and 135b in FIG. 1). One (for example, buffer memory 135a) stores information indicating the physical change amount of the I picture of the reference video frame, and the other (for example, buffer memory 135b) stores the physical change amount of the I picture of each video frame read in sequence. The image information held in the two buffer memories 135a and 135b is compared and the image similarity is calculated to determine whether the frame marks a segment or a shot.

As shown in FIG. 12, when recording of a program starts, the system control unit 120 continuously monitors the stream control unit 132 and determines whether the encoded compressed video in the stream information being recorded is at a GOP start point (step S101). When a GOP is detected, the system control unit 120 has the I picture at the head of that GOP transferred to the feature scene extraction unit 135. The feature scene extraction unit 135, which contains the two buffer memories 135a and 135b, determines whether the first of them (hereinafter the "first buffer memory") 135a, which records the physical change amount of the reference frame image, is in use (step S102). If the first buffer memory 135a is empty, the MPEG DC components, which serve as the physical change amount of the I picture, are stored in the first buffer memory 135a (step S103), and the unit waits for the next GOP to be detected. In the present embodiment the DC components are recorded as the physical change amount of the video, but a color histogram of the image, motion vectors, or a combination of these may be used instead.

If it is determined in step S102 that the first buffer memory 135a is in use, the DC components that serve as the physical change amount of the video information are stored in the other buffer memory (hereinafter the "second buffer memory") 135b (step S104). The image similarity is then calculated by computing the per-pixel differences between the DC components recorded in the first buffer memory 135a and the second buffer memory 135b, and it is determined whether the calculated value is greater than or equal to a predetermined first threshold α (step S105).

When the physical change amount of the video information exceeds the first threshold α, the I picture is determined to mark a segment (step S106). A thumbnail image is generated from the DC components that serve as the physical change amount of the I picture and is recorded as a thumbnail image file 214. At the same time, the start time information of the segment and the thumbnail image file name are recorded in the metadata management file 213. If a segment precedes the detected segment, the end time information of that preceding segment is recorded.

Because a segment is an aggregation of shots, detecting a segment also generates the information for the corresponding shot (step S107). When a shot is generated, a thumbnail image file 214 is generated in the same way, and the shot start time, I-picture position information, I-picture size, thumbnail image file name, and so on are recorded in the metadata management file 213. If a shot precedes the detected shot, the end time information of that preceding shot is recorded.

Then, to update the reference frame information, the contents of the second buffer memory 135b are copied into the first buffer memory 135a (step S108), and the process proceeds to step S110.

On the other hand, if the physical change amount is less than or equal to the first threshold α in step S105, the segment determination step, then step S109, the shot determination step, is carried out. Here the physical change amount is compared with a second threshold β, which is lower than the first threshold α. If the physical change amount exceeds the second threshold β, the shot information acquisition flow from step S107 onward is carried out, and the process proceeds to step S110. If the physical change amount is less than or equal to the second threshold β in step S109, the process proceeds to step S110. In step S110, steps S101 to S109 are repeated for the next GOP until recording ends.
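The two-threshold flow of steps S101 to S110 can be sketched as below. This is an illustration, not the apparatus's actual implementation: the mean absolute difference is an assumed metric (the description says only that per-pixel differences are computed), the strict comparison against α follows the "exceeds" wording of the later steps, and the tiny four-value frames stand in for real DC-component images.

```python
from typing import List, Optional, Sequence, Tuple


def mean_abs_diff(a: Sequence[int], b: Sequence[int]) -> float:
    """Assumed change metric: mean absolute per-pixel difference of DC values."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def detect_boundaries(dc_frames: Sequence[Sequence[int]],
                      alpha: float, beta: float) -> Tuple[List[int], List[int]]:
    """Return (segment_starts, shot_starts) as indices into dc_frames."""
    if not beta < alpha:
        raise ValueError("beta must be lower than alpha")
    segment_starts: List[int] = []
    shot_starts: List[int] = []
    reference: Optional[Sequence[int]] = None  # first buffer memory 135a
    for index, current in enumerate(dc_frames):  # current plays the role of 135b
        if reference is None:
            reference = current                  # S103: fill the empty buffer
            continue
        change = mean_abs_diff(reference, current)
        if change > alpha:                       # S105: segment boundary
            segment_starts.append(index)         # S106
            shot_starts.append(index)            # S107: a segment implies a shot
            reference = current                  # S108: update the reference
        elif change > beta:                      # S109: shot boundary only
            shot_starts.append(index)            # S107
            reference = current                  # S108
        # otherwise the reference frame is kept and the next GOP is awaited
    return segment_starts, shot_starts


frames = [[0] * 4, [2] * 4, [20] * 4, [100] * 4, [101] * 4]
print(detect_boundaries(frames, alpha=50, beta=10))
```

Because the reference is replaced only at a detected boundary, slow drifts accumulate against the last boundary frame instead of being absorbed frame by frame.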

In the present embodiment, whether a frame marks a segment is determined simply from the image similarity of two consecutive I pictures. Alternatively, the physical change amounts over a fixed period may be stored in the buffer memory, and the determination may be based on the continuity of the physical change amount within that period. With this configuration, even if a momentary scene change is inserted into an otherwise continuous video scene, the segment can be recognized as continuing, so semantically meaningful discontinuities can be identified more appropriately.

FIG. 13 is a flowchart showing the process of acquiring unnecessary scene information in the encoded video reproduction apparatus 100. The operation by which the unnecessary scene specifying unit 136 identifies unnecessary sections, namely commercial video, is described in detail with reference to FIG. 13. In television broadcasting, commercials generally have a silent portion between the main program and a commercial and between one commercial and the next, and these silent portions appear with a fixed periodicity (that is, the silent portions are spaced at predetermined time intervals). This property is used to determine whether a portion is a commercial.

When recording starts, the stream control unit 132 transfers the stream information to the unnecessary scene specifying unit 136 in accordance with an instruction from the system control unit 120. The unnecessary scene specifying unit 136 extracts the audio information from the received stream information (step S201) and detects the audio output level from the extracted audio information (step S202). When the audio output level is less than or equal to a threshold ε, the unnecessary scene specifying unit 136 determines that the portion is silent and advances the process to step S204. Otherwise, the processing from step S201 onward is repeated until a silent portion is detected in the audio information.

In step S204, the time at which silence was detected is compared with the times of the shots, obtained by the feature scene extraction unit 135, that indicate scene change points, to check whether the silent portion lies near a scene change point. Then, from the result of step S204, the length of the silent portion, and the periodicity of the silent portions, it is determined whether the pattern matches the rules for commercial video (CM video) (step S205). Because a silent portion characteristically exists between one commercial and the next, a section can be determined to be commercial video when very short silent portions are detected at a fixed period.

If the section is determined to be commercial video, it is held in the buffer memory in the unnecessary scene specifying unit 136 (step S206) and recorded in the metadata management file 213. The processing from step S201 onward then continues until recording ends (step S207).
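One way the silence-based rule of steps S204 and S205 might look is sketched below. The description does not give the concrete rule, so everything numeric here is an assumption: the 15-second commercial unit, the limit of four units between silences, and the 0.5-second tolerance are hypothetical parameters, and the timestamps are invented.

```python
from typing import List, Sequence, Tuple


def _is_cm_gap(gap: float, unit: float, max_units: int, tol: float) -> bool:
    """True if gap is close to 1..max_units multiples of the commercial unit."""
    n = round(gap / unit)
    return 1 <= n <= max_units and abs(gap - n * unit) <= tol


def find_commercial_sections(silences: Sequence[float],
                             scene_changes: Sequence[float],
                             unit: float = 15.0, max_units: int = 4,
                             tol: float = 0.5) -> List[Tuple[float, float]]:
    """Return (start, end) times of runs of silences that look like commercials."""
    # S204: keep only silent points that lie near a scene change point.
    candidates = sorted(t for t in silences
                        if any(abs(t - c) <= tol for c in scene_changes))
    sections: List[Tuple[float, float]] = []
    start = prev = None
    for t in candidates:
        # S205: the run continues while silences recur at commercial-length gaps.
        if prev is not None and _is_cm_gap(t - prev, unit, max_units, tol):
            prev = t
            continue
        if start is not None and prev > start:
            sections.append((start, prev))
        start = prev = t
    if start is not None and prev > start:
        sections.append((start, prev))
    return sections


silences = [100.0, 300.2, 315.1, 345.0, 360.1, 700.0]
scene_changes = [100.0, 300.0, 315.0, 345.0, 360.0, 500.0]
print(find_commercial_sections(silences, scene_changes))
```

In this invented timeline the isolated silence at 100 s and the silence at 700 s (which has no nearby scene change) are ignored, and the run from 300.2 s to 360.1 s is reported as one commercial section.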

After a commercial video section has been determined, the thumbnail image files 214 of the shots before and after that section may be compared, and if they are judged to be duplicated video, one of the two video sections may be included in the unnecessary scenes.

FIG. 14 is a flowchart showing the video search process in the encoded video reproduction apparatus 100. FIG. 14 describes the process of displaying the video search screens shown in FIGS. 7 and 8 when the user searches for a desired video scene during playback of a recorded program.

First, when the pause key of the operation input unit 110 is pressed during playback of a recorded program, the system control unit 120 acquires the start time and end time of the program being played, which are recorded in the playback control information file 211. It also acquires the current playback time information that it holds (step S301).

The system control unit 120 then instructs the OSD generation unit 137 to generate the video search screen (basic) shown in FIG. 7. The OSD generation unit 137 generates the playback time bar area 701 shown in FIG. 7 from the start time, end time, and current playback time of the program being played. The OSD generation unit 137 also determines the positions of the unnecessary scenes in the program from the unnecessary scene information 540 in the metadata management file 213 and draws the regions shown in black in the playback time bar area 701 (step S302).
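Drawing the time bar in step S302 amounts to mapping program times to horizontal positions. A minimal sketch follows, assuming a hypothetical bar width of 600 pixels and an invented unnecessary section from 00:05:00 to 00:07:00; the program length and current time are the FIG. 7 values (00:30:00 and 00:22:15).

```python
from typing import List, Tuple


def time_to_x(t: float, start: float, end: float, bar_width: int) -> int:
    """Map a time in seconds to a pixel offset on the bar, clamped to its ends."""
    t = min(max(t, start), end)
    return round((t - start) * bar_width / (end - start))


def unnecessary_spans(scenes: List[Tuple[float, float]], start: float,
                      end: float, bar_width: int) -> List[Tuple[int, int]]:
    """Pixel ranges to paint black for each unnecessary scene."""
    return [(time_to_x(s, start, end, bar_width),
             time_to_x(e, start, end, bar_width)) for s, e in scenes]


PROGRAM_START, PROGRAM_END = 0.0, 30 * 60.0  # 00:00:00 to 00:30:00
BAR_WIDTH = 600                              # hypothetical width in pixels
current_x = time_to_x(22 * 60 + 15, PROGRAM_START, PROGRAM_END, BAR_WIDTH)
black = unnecessary_spans([(300.0, 420.0)], PROGRAM_START, PROGRAM_END, BAR_WIDTH)
print(current_x, black)
```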

Thereafter, the system control unit 120 identifies, from the metadata management file 213, the segment to which the current playback time belongs. The segment is identified by checking, for every segment, whether the current playback time falls within the section from “Start_segment_time” 524 to “End_segment_time” 525 recorded in the segment information 520.
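The segment lookup described above amounts to a linear scan over the segment information. A minimal sketch follows, assuming the fields are held in memory as a simple record. The `Segment` class, the use of seconds as the time unit, and the half-open interval test are assumptions; the text does not state whether the end time is inclusive.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Segment:
    # Field names mirror the syntax elements named in the text.
    start_segment_time: float
    end_segment_time: float
    thumbnail_segment_name: str


def find_segment_index(segments: Sequence[Segment],
                       current_time: float) -> Optional[int]:
    """Return the index of the segment containing `current_time`,
    checking every segment in turn, or None if no segment matches."""
    for i, seg in enumerate(segments):
        # Half-open interval [start, end) is an assumption.
        if seg.start_segment_time <= current_time < seg.end_segment_time:
            return i
    return None
```

Because segments are recorded in time order, a binary search would also work, but the text describes testing all segments, which the scan reflects.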

When the segment has been identified, “thumbnail_segment_name” 523 is acquired from the corresponding segment information 520, the image data of the same name is acquired from the thumbnail image file 214, and it is placed as the central segment image of the video search screen (basic). The same processing is performed for the two segments before and the two segments after the current segment, whereby the segment image information area 702 is displayed and updated (step S303). Note that, when the thumbnail image files 214 are extracted, the image files may be selected from the sections that exclude the unnecessary scenes shown in FIGS. 9(A) to 9(E), FIGS. 10(A) to 10(E), and FIGS. 11(A) to 11(E).
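The arrangement in step S303 places the current segment in the center with two neighbours on each side, giving five slots. A minimal sketch of that window selection is shown below. Leaving a slot empty (`None`) near the beginning or end of the program is an assumption, since the text does not say how the edges are handled, and `thumbnail_window` is a hypothetical name.

```python
from typing import List, Optional, Sequence


def thumbnail_window(names: Sequence[str], center: int,
                     radius: int = 2) -> List[Optional[str]]:
    """Return the thumbnail names for the slots center-radius .. center+radius.

    Slots that fall outside the program are returned as None.
    """
    if not 0 <= center < len(names):
        raise IndexError("center must refer to an existing segment")
    return [names[i] if 0 <= i < len(names) else None
            for i in range(center - radius, center + radius + 1)]
```

With `radius=2` the window always has an odd number of slots, so the selected segment sits in the middle.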

When the left or right key of the operation input unit 110 is pressed, the position of the selected segment is updated (step S304). Similarly, when the up key of the operation input unit 110 is pressed (step S305), the processing proceeds to the step of displaying the video search screen (details) shown in step S307. When the enter key of the operation input unit 110 is pressed (step S306), the preprocessing for executing the video search shown in step S311 is performed. The processing from step S303 onward is repeated until an input is received from the operation input unit 110.

Step S307 is the processing for displaying the video search screen (details). The system control unit 120 first acquires “ref_to_shotID” 521 of the currently selected segment information 520. The system control unit 120 then acquires the shot information 530 whose “Shot_ID” 602 is the same as that “ref_to_shotID” 521, and places it as the central shot image in the video search screen (details). Thereafter, the thumbnail image files 214 are arranged in the shot image information area 801 by the same procedure as the thumbnail arrangement rule used for the segment image information area 702. Note that, when the thumbnail image files 214 are extracted, the image files may be selected from the sections that exclude the unnecessary scenes shown in FIGS. 9(A) to 9(E), FIGS. 10(A) to 10(E), and FIGS. 11(A) to 11(E).

When the left or right key of the operation input unit 110 is pressed, the position of the selected shot is updated (step S308). Similarly, when the down key of the operation input unit 110 is pressed (step S309), the processing returns to the step of displaying the video search screen (basic) shown in step S303. When the enter key of the operation input unit 110 is pressed (step S310), the preprocessing for executing the video search shown in step S311 is performed. The processing from step S307 onward is repeated until an input is received from the operation input unit 110.
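The key handling for the two screens (steps S304 to S310) can be summarised as a small state machine. The sketch below is illustrative only: the `SearchState` class, the key names, the clamping at the first and last item, and the `ref_to_shot` list (one starting shot index per segment, standing in for “ref_to_shotID”) are all assumptions.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class SearchState:
    screen: str = "basic"   # "basic" (segments) or "details" (shots)
    segment: int = 0        # index of the selected segment
    shot: int = 0           # index of the selected shot


def handle_key(state: SearchState, key: str,
               ref_to_shot: Sequence[int], n_shots: int) -> bool:
    """Update `state` for one key press.

    Returns True when the enter key was pressed, i.e. when the search
    preprocessing of step S311 should run for the current selection.
    """
    if key == "enter":                      # steps S306 / S310
        return True
    if state.screen == "basic":
        if key == "left":                   # step S304
            state.segment = max(0, state.segment - 1)
        elif key == "right":
            state.segment = min(len(ref_to_shot) - 1, state.segment + 1)
        elif key == "up":                   # step S305 -> S307
            state.screen = "details"
            state.shot = ref_to_shot[state.segment]
    else:
        if key == "left":                   # step S308
            state.shot = max(0, state.shot - 1)
        elif key == "right":
            state.shot = min(n_shots - 1, state.shot + 1)
        elif key == "down":                 # step S309 -> S303
            state.screen = "basic"
    return False
```

In this sketch, returning to the basic screen keeps the previously selected segment; the text does not say whether the segment selection follows the shot selection.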

In step S311, the start time information associated with the segment or shot on which the enter key of the operation input unit 110 was pressed is acquired from “Start_segment_time” 524 or “Start_shot_time” 532 recorded in the metadata management file 213 (step S311). Then, in accordance with an instruction from the system control unit 120, a time search is performed to the position given by that start time information, and normal playback is performed from that point, thereby executing the search for the video scene to be viewed (step S310). After the time search, playback may shift to normal playback or may remain paused.
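The lookup in step S311 selects one of two fields depending on which screen the selection was made on. A minimal sketch, assuming the metadata management file has been parsed into a dictionary with hypothetical `"segments"` and `"shots"` lists whose entries carry the field names given in the text:

```python
from typing import Any, Mapping


def start_time_for_selection(metadata: Mapping[str, Any],
                             kind: str, index: int) -> float:
    """Return the time to seek to for the selected segment or shot."""
    if kind == "segment":
        return metadata["segments"][index]["Start_segment_time"]
    if kind == "shot":
        return metadata["shots"][index]["Start_shot_time"]
    raise ValueError("kind must be 'segment' or 'shot'")
```

The returned value would then be handed to the time search; whether playback resumes or stays paused afterwards is a separate choice, as noted above.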

According to the apparatus and method of the present embodiment, thumbnail images hierarchized on the basis of semantic importance can be displayed and selected for sections that exclude scenes of low semantic importance for understanding the program, for example, commercial video or the duplicated scenes located before and after a commercial. Therefore, according to the apparatus and method of the present embodiment, the user can search for a desired video scene quickly, easily, and reliably by selecting from the hierarchized thumbnail images.

Further, according to the apparatus and method of the present embodiment, the thumbnail images held in the encoded video reproduction apparatus 100 need only be the image information representing shots and segments; thumbnail images need not be held for every encoding/compression unit. In addition, the thumbnail images are generated from the DC component used in MPEG, which is the component representing the average value of the image, so that comparison between images is facilitated and thumbnail image information with excellent image compression efficiency can be recorded. As a result, the amount of image information handled by the encoded video reproduction apparatus 100 is small, high information processing capability is not required of the hardware and/or software of the encoded video reproduction apparatus 100, and a desired video scene can be searched for efficiently even with a small circuit scale.
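To illustrate why a DC-based thumbnail is small and easy to compare, the sketch below computes one value per 8x8 block as the block mean, which is what the DC coefficient of a DCT block represents up to a scale factor. It works on already decoded luminance samples for clarity. An actual decoder would more likely read the DC coefficients from the bitstream without fully decoding each picture, and the function name `dc_thumbnail` is hypothetical.

```python
from typing import List, Sequence


def dc_thumbnail(luma: Sequence[Sequence[float]],
                 block: int = 8) -> List[List[float]]:
    """Reduce a luminance image to one mean value per block x block tile."""
    height = len(luma)
    width = len(luma[0]) if height else 0
    if height == 0 or width == 0 or height % block or width % block:
        raise ValueError("image size must be a non-zero multiple of block")
    thumb = []
    for by in range(0, height, block):
        row = []
        for bx in range(0, width, block):
            total = sum(luma[y][x]
                        for y in range(by, by + block)
                        for x in range(bx, bx + block))
            row.append(total / (block * block))
        thumb.append(row)
    return thumb
```

A 720x480 picture reduces to 90x60 values this way, which is 1/64 of the original sample count.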

FIG. 1 is a block diagram schematically showing the configuration of a system including an encoded video reproduction apparatus according to an embodiment of the present invention.
FIG. 2 is a diagram showing the logical file structure in the recording/reproducing drive unit of the encoded video reproduction apparatus according to the embodiment.
FIG. 3 is a diagram showing the four-layer data management structure of the stream information file shown in FIG. 2.
FIG. 4 is a diagram showing the syntax of the playback control information file shown in FIG. 2.
FIG. 5 is a diagram showing the syntax of the metadata management file shown in FIG. 2.
FIG. 6 is a diagram showing the relationship between segments and shots in the data structure of the metadata management file shown in FIG. 5.
FIG. 7 is a diagram showing an example of the video search screen (basic) displayed on a display device by the encoded video reproduction apparatus according to the embodiment.
FIG. 8 is a diagram showing an example of the video search screen (details) displayed on a display device by the encoded video reproduction apparatus according to the embodiment.
FIGS. 9(A) to 9(E) are explanatory diagrams of processing for generating a thumbnail screen while excluding the unnecessary scene specified by the unnecessary scene specifying unit of the encoded video reproduction apparatus according to the embodiment.
FIGS. 10(A) to 10(E) are explanatory diagrams of other processing for generating a thumbnail screen while excluding the unnecessary scene specified by the unnecessary scene specifying unit of the encoded video reproduction apparatus according to the embodiment.
FIGS. 11(A) to 11(E) are explanatory diagrams of other processing for generating a thumbnail screen while excluding the unnecessary scene specified by the unnecessary scene specifying unit of the encoded video reproduction apparatus according to the embodiment.
FIG. 12 is a flowchart showing the processing for acquiring segment information and shot information in the encoded video reproduction apparatus according to the embodiment.
FIG. 13 is a flowchart showing the processing for acquiring unnecessary scene information in the encoded video reproduction apparatus according to the embodiment.
FIG. 14 is a flowchart showing the video search processing in the encoded video reproduction apparatus according to the embodiment.

Explanation of symbols

100 encoded video reproduction apparatus, 110 operation input unit, 120 system control unit, 121 memory unit, 130 decoder block, 131 broadcast reception unit, 132 stream control unit, 133 recording/reproducing drive unit, 134 video/audio decoder unit, 135 feature scene extraction unit, 136 unnecessary scene specifying unit, 140 display device, 200 root directory, 201 multimedia directory, 202 stream management directory, 203 metadata management directory, 211 playback control information file, 212 stream information file, 213 metadata management file, 214 thumbnail image file, 300 GOP, 301 shot, 302 segment, 701 playback time bar area, 702 segment image information area, 703 segment attribute information area, 801 shot image information area, 802 shot attribute information area.

Claims

1. An encoded video reproduction apparatus comprising:
recording/reproducing means for recording and reproducing moving image data in which an encoded and compressed video signal and audio signal are multiplexed;
feature scene extraction means for detecting, from a physical change amount of the video signal and using a plurality of determination criteria, video sections of a plurality of layers respectively corresponding to the plurality of determination criteria, and for extracting feature information representing each video section of each layer;
search screen generating means for generating, on the basis of the feature information, a video search menu including thumbnail images representing the respective video sections of each layer;
operation input means for receiving an operation input by a user; and
control means for causing, when one of the thumbnail images in the video search menu is selected via the operation input means, the recording/reproducing means to start playback from the video scene associated with the selected thumbnail image.

2. The encoded video reproduction apparatus according to claim 1, further comprising unnecessary scene specifying means for specifying an unnecessary scene from physical change amounts of both the video signal and the audio signal,
wherein the search screen generating means generates the video search menu for video sections excluding the unnecessary scene.

3. The encoded video reproduction apparatus according to claim 1, wherein the plurality of determination criteria include a first threshold and a second threshold smaller than the first threshold,
the video section of the layer corresponding to the first threshold is a segment, and
the video section of the layer corresponding to the second threshold is a shot, which is a section that coincides with the segment or a section obtained by dividing the segment.

4. The encoded video reproduction apparatus according to claim 3, wherein the video search menu includes a segment image information area containing a plurality of thumbnail images each representing a segment, and a shot image information area containing a plurality of thumbnail images each representing a shot, and
the thumbnail images contained in the shot image information area include thumbnail images of a plurality of shots selected in chronological order with reference to one of the thumbnail images contained in the segment image information area.

5. The encoded video reproduction apparatus according to any one of claims 1 to 4, wherein the number of thumbnail images of the same layer included in the video search menu is an odd number.

6. An encoded video reproduction method comprising:
a step in which feature scene extraction means detects, from a physical change amount of the video signal of moving image data that is recorded by recording/reproducing means and in which an encoded and compressed video signal and audio signal are multiplexed, and using a plurality of determination criteria, video sections of a plurality of layers respectively corresponding to the plurality of determination criteria, and extracts feature information representing each video section of each layer;
a step in which search screen generating means generates, on the basis of the feature information, a video search menu including thumbnail images representing the respective video sections of each layer; and
a step in which, when one of the thumbnail images in the video search menu is selected via operation input means that receives an operation input by a user, the recording/reproducing means is caused to start playback from the video scene associated with the selected thumbnail image.

7. The encoded video reproduction method according to claim 6, further comprising a step in which unnecessary scene specifying means specifies an unnecessary scene from physical change amounts of both the video signal and the audio signal,
wherein, in the step of generating the video search menu, the video search menu is generated for video sections excluding the unnecessary scene.

8. The encoded video reproduction method according to claim 6 or 7, wherein the plurality of determination criteria include a first threshold and a second threshold smaller than the first threshold,
the video section of the layer corresponding to the first threshold is a segment, and
the video section of the layer corresponding to the second threshold is a shot, which is a section that coincides with the segment or a section obtained by dividing the segment.

9. The encoded video reproduction method according to claim 8, wherein the video search menu includes a segment image information area containing a plurality of thumbnail images each representing a segment, and a shot image information area containing a plurality of thumbnail images each representing a shot, and
the thumbnail images contained in the shot image information area include thumbnail images of a plurality of shots selected in chronological order with reference to one of the thumbnail images contained in the segment image information area.

10. The encoded video reproduction method according to claim 6, wherein the number of thumbnail images in the same hierarchy included in the video search menu is an odd number.