JP2011004197A - Recording and reproducing apparatus and reproducing method

Info

Publication number: JP2011004197A
Application number: JP2009145856A
Authority: JP
Inventors: Takashi Kasano; 孝志笠野
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 2009-06-18
Filing date: 2009-06-18
Publication date: 2011-01-06

Abstract

PROBLEM TO BE SOLVED: To enable viewing of content while omitting overlapping music sections from among the music sections in the content. SOLUTION: A theme song detector 14 consists of a music section detector 101, a provider scene detector 102, a shot detector 103, a telop detector 104, a closed caption detector 105, a language analyzer 106, and a detected information manager 107. The music section detector 101 detects main music sections from an audio signal. The provider scene detector 102 detects a scene representing a provider in the video immediately before and after a main music section. The shot detector 103 detects a main music section in which the average shot interval is within a predetermined period of time. The language analyzer 106 detects a section in which a predetermined keyword exists, using the telops and closed captions of the main music section detected by the telop detector 104 and the closed caption detector 105. The detected information manager 107 decides, on the basis of the detection results, a theme song section for which reproduction is omitted.

Description

本発明は録画再生装置及び録画再生装置に記憶されたコンテンツの再生方法に関する。 The present invention relates to a recording/playback apparatus and a method for playing back content stored in the recording/playback apparatus.

録画再生装置に搭載される記録装置の容量が増大したことにより、多数のコンテンツを
記録することが可能となった。この多数コンテンツの中から、ユーザが視聴を望む部分を
抽出し、効率良くコンテンツを視聴する工夫がされている。 As the capacity of the recording device mounted in recording/playback apparatuses has increased, it has become possible to record a large amount of content. Techniques have been devised to extract, from this large amount of content, the parts that the user wishes to view so that the content can be viewed efficiently.

例えば、特許文献１に開示されている技術によると、シリーズ化した複数話構成のコン
テンツにおいて、一定区間共通するデータがあるか否かの判別を行い、その結果、共通し
ている一定区間についてはスキップする。これにより、同一シリーズ中で共通する区間を
省略し、コンテンツ本編のみを再生させることができ、視聴時間を短縮することができる
。 For example, according to the technology disclosed in Patent Document 1, it is determined whether content consisting of multiple episodes of a series contains data common to a certain section, and any section found to be common is skipped. As a result, sections common within the same series are omitted and only the main part of the content is played back, which shortens the viewing time.

特開２００８−１０３５８５号 JP 2008-103585 A

上述の技術は、複数のコンテンツ間で比較を行うことでスキップする区間を判別するた
め、同一シリーズを複数話録画しなければ、視聴時間の短縮をすることができない。また
、上述の技術では、複数話の間で時刻に沿って構成のデータを比較する。即ち、コンテン
ツの構成において同一の時刻に録画されていなければ、共通するシーンとして検出できな
い。しかし、シリーズ化した複数話構成のコンテンツにおいては、初回や最終回はコンテ
ンツの構成が異なる場合があり、共通するシーンとして検出されない問題があった。そこ
で、コンテンツの構成が異なる場合にも共通するシーンを検出し、そのシーンを省略して
視聴することが望まれる。 Because the above-described technology determines the sections to skip by comparing multiple pieces of content, the viewing time cannot be shortened unless multiple episodes of the same series have been recorded. Furthermore, the above-described technology compares the data making up the episodes along the time axis. In other words, a scene cannot be detected as a common scene unless it is recorded at the same time position in the structure of each episode. However, in content consisting of multiple episodes of a series, the first and last episodes may be structured differently, so that a common scene is not detected. It is therefore desirable to detect common scenes even when the content structure differs, and to view the content with those scenes omitted.

本発明は、上述した点に鑑みてなされたものであり、コンテンツ中の音楽区間から、重
複する音楽区間を省略して視聴することを目的とする。 The present invention has been made in view of the above-described points, and an object thereof is to view a music section in content while omitting overlapping music sections.

上記目的を達成するために、本発明に係る録画再生装置は、コンテンツを受信するコン
テンツ受信部と、前記コンテンツ受信部にて受信したコンテンツを記憶する記憶装置と、
前記コンテンツ受信部にて受信したコンテンツを構成する信号のうち、少なくとも映像信
号、音声信号及び字幕信号を分離するデコーダ部と、前記デコーダ部により分離された音
声信号を用いて、コンテンツ本編中にて音楽が含まれる区間を検出する第１の検出部と、
前記第１の検出部にて検出された区間に存在する字幕信号により構成される文字列中から
、シリーズ化したコンテンツに重複する区間に存在する文字を検出する第２の検出部と、
前記第２の検出部により検出された検出結果を記憶し、記憶された検出結果を用いて再生
を省略する区間を決定する検出情報管理部と、前記コンテンツを再生する場合、前記検出
情報管理部にて決定された区間を省略して再生する再生制御部とを有することを特徴とし
ている。 In order to achieve the above object, a recording/playback apparatus according to the present invention comprises: a content receiving unit that receives content; a storage device that stores the content received by the content receiving unit; a decoder unit that separates at least a video signal, an audio signal, and a caption signal from the signals constituting the content received by the content receiving unit; a first detection unit that uses the audio signal separated by the decoder unit to detect sections of the main part of the content that contain music; a second detection unit that detects, in a character string formed from the caption signal present in a section detected by the first detection unit, characters that appear in sections duplicated across serialized content; a detection information management unit that stores the detection result of the second detection unit and uses the stored detection result to determine a section whose playback is to be omitted; and a playback control unit that, when the content is played back, plays it back while omitting the section determined by the detection information management unit.

また、本発明に係る再生方法は、コンテンツを受信し、前記コンテンツを構成する信号
のうち、少なくとも映像信号、音声信号及び字幕信号を分離し、前記音声信号を用いてコ
ンテンツ本編中で音楽が含まれる区間を検出し、検出した前記区間に存在する字幕信号に
より構成される文字列中から、シリーズ化したコンテンツに重複する区間を示す文字が存
在する区間を検出し、前記字幕信号を用いた検出結果により再生を省略する区間を決定し
、決定した前記区間を前記コンテンツ中から省略して再生することを特徴としている。 A playback method according to the present invention comprises: receiving content; separating at least a video signal, an audio signal, and a caption signal from the signals constituting the content; detecting, using the audio signal, sections of the main part of the content that contain music; detecting, from a character string formed from the caption signal present in the detected sections, a section containing characters that indicate a section duplicated across serialized content; determining, from the detection result obtained using the caption signal, a section whose playback is to be omitted; and playing back the content with the determined section omitted.

コンテンツ中の音楽区間から、重複する音楽区間を省略して視聴することが実現する。 It is possible to perform viewing by omitting overlapping music sections from the music sections in the content.

本実施の形態における録画再生装置の機能ブロック図。 FIG. 1 is a functional block diagram of the recording/playback apparatus in the present embodiment.
本実施の形態における主題歌検出部の機能ブロック図。 FIG. 2 is a functional block diagram of the theme song detection unit in the present embodiment.
本実施の形態における提供元シーンによる検出手順を表すフローチャート。 FIG. 3 is a flowchart showing the detection procedure based on provider scenes in the present embodiment.
本実施の形態におけるショット間隔による検出手順を表すフローチャート。 FIG. 4 is a flowchart showing the detection procedure based on shot intervals in the present embodiment.
本実施の形態におけるテロップによる検出手順を表すフローチャート。 FIG. 5 is a flowchart showing the detection procedure based on telops in the present embodiment.
本実施の形態における字幕による検出手順を表すフローチャート。 FIG. 6 is a flowchart showing the detection procedure based on captions in the present embodiment.
本実施の形態における無字幕区間による検出手順を表すフローチャート。 FIG. 7 is a flowchart showing the detection procedure based on non-caption sections in the present embodiment.
本実施の形態における検出情報管理部により出力する主題歌区間を決定する手順を表すフローチャート。 FIG. 8 is a flowchart showing the procedure by which the detection information management unit determines the theme song section to be output in the present embodiment.
本実施の形態における検出情報管理部により出力する主題歌区間を決定する手順の変形例を表すフローチャート。 FIG. 9 is a flowchart showing a modification of the procedure by which the detection information management unit determines the theme song section to be output in the present embodiment.

以下に、図１〜図９を参照して、本発明の実施の形態について説明する。録画再生装置
１は、放送局やインターネット上のサーバ等から受信したコンテンツを録画し、再生する
装置である。例えば、テレビジョン受像機やパーソナルコンピュータである。まず、図１
を用いて、録画再生装置１の機能について説明する。図１は、本実施の形態における録画
再生装置１の機能ブロック図である。 Hereinafter, an embodiment of the present invention will be described with reference to FIGS. 1 to 9. The recording/playback apparatus 1 is an apparatus that records and plays back content received from a broadcasting station, a server on the Internet, or the like, for example a television receiver or a personal computer. First, the functions of the recording/playback apparatus 1 will be described with reference to FIG. 1. FIG. 1 is a functional block diagram of the recording/playback apparatus 1 according to the present embodiment.

録画再生装置１は、コンテンツ受信部１０と、記録再生部１１と、記録装置１２と、デ
コーダ部１３と、主題歌検出部１４と、ＣＭ検出部１５と、受信部１６と、再生制御部１
７と、Ｉ／Ｆ（Ｉｎｔｅｒｆａｃｅ）部１８とから構成されている。 The recording/playback apparatus 1 includes a content receiving unit 10, a recording/reproducing unit 11, a recording device 12, a decoder unit 13, a theme song detection unit 14, a CM detection unit 15, a receiving unit 16, a reproduction control unit 17, and an I/F (interface) unit 18.

コンテンツ受信部１０は、外部のアンテナを介して受信されるコンテンツの信号を復調
し、映像音声や字幕等の信号を、記録再生部１１と、デコーダ部１３とに伝送する。コン
テンツとは、放送局から放送されるものや、インターネット上で配信されるもの等がある
。 The content receiving unit 10 demodulates the content signal received via an external antenna and transmits the video, audio, caption, and other signals to the recording/reproducing unit 11 and the decoder unit 13. Content includes content broadcast from broadcasting stations, content distributed over the Internet, and the like.

記録再生部１１は、コンテンツ受信部１０から各種信号を受信し、それらを記録装置１
２に記録する形式に変換する。また、記録再生部１１は、主題歌検出部１４から入力され
る主題歌区間を特定するメタデータを各種信号に付加して、記録装置１２に記録する。主
題歌区間を特定するメタデータとは、主題歌区間の時刻情報と、主題歌区間を検出するた
めに用いたアルゴリズムと、アルゴリズムを用いた際に各種信号と該当したキーワードや
映像等である。また、記録再生部１１は、再生制御部１７から再生の動作を表す操作信号
を受信した場合、記録装置１２から各種信号及び付加されたメタデータを読出し、再生制
御部１７で扱う形式に変換して再生制御部１７へ送信する。 The recording/reproducing unit 11 receives various signals from the content receiving unit 10 and converts them into the format in which they are recorded on the recording device 12. The recording/reproducing unit 11 also adds the metadata specifying the theme song section, input from the theme song detection unit 14, to the various signals and records them on the recording device 12. The metadata specifying the theme song section consists of the time information of the theme song section, the algorithm used to detect the theme song section, and the keywords, video, and the like that matched the various signals when the algorithm was applied. Further, when the recording/reproducing unit 11 receives an operation signal representing a playback operation from the reproduction control unit 17, it reads the various signals and the added metadata from the recording device 12, converts them into a format handled by the reproduction control unit 17, and transmits them to the reproduction control unit 17.
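The metadata described above can be pictured as a small record. The following Python sketch is illustrative only: the class name, field names, and algorithm labels are assumptions introduced here and do not appear in the specification, which states only that the metadata holds time information, the algorithm used, and the matched keywords or video.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ThemeSongMetadata:
    """Hypothetical record for one detected theme song section."""
    start_sec: float   # time information: start of the section
    end_sec: float     # time information: end of the section
    algorithm: str     # assumed labels, e.g. "provider_scene", "shot_interval",
                       # "telop", "caption", "no_caption"
    evidence: List[str] = field(default_factory=list)  # matched keywords or scene ids


# Example: a 90-second section detected from a telop keyword.
meta = ThemeSongMetadata(60.0, 150.0, "telop", ["オープニングテーマ"])
```

Keeping the algorithm label and the evidence next to the time range would let a later stage weigh results from different detectors, which is the role the text assigns to the detection information management unit 107.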

記録装置１２は、記録再生部１１により変換された映像音声や字幕等のコンテンツを構
成する各種信号を記録する。また、記録装置１２は、主題歌検出部１４により決定された
主題歌区間を特定するメタデータを各種信号と共に記録する。記録装置１２は、例えばハ
ードディスク、半導体メモリ、光ディスク、磁気ディスク等である。 The recording device 12 records the various signals constituting the content, such as video, audio, and captions, converted by the recording/reproducing unit 11. The recording device 12 also records the metadata specifying the theme song section determined by the theme song detection unit 14 together with the various signals. The recording device 12 is, for example, a hard disk, a semiconductor memory, an optical disk, or a magnetic disk.

デコーダ部１３は、コンテンツ受信部１０から受信したコンテンツを構成する各種信号
から、映像信号と音声信号と字幕信号とに分離する。そして、デコーダ部１３は、分離し
た映像信号と音声信号と字幕信号とを、主題歌検出部１４にて処理できる形式に変換して
、主題歌検出部１４に送信する。また、デコーダ部１３は、分離した音声信号をＣＭ検出
部１５にて処理できる形式に変換して、ＣＭ検出部１５に送信する。デコーダ部１３は、
主題歌検出部１４及びＣＭ検出部１５に各種信号を送信する際に、分離した各種の信号同
士で同期が取れるように時刻情報を付加する。 The decoder unit 13 separates the various signals constituting the content received from the content receiving unit 10 into a video signal, an audio signal, and a caption signal. The decoder unit 13 then converts the separated video, audio, and caption signals into a format that can be processed by the theme song detection unit 14 and transmits them to the theme song detection unit 14. The decoder unit 13 also converts the separated audio signal into a format that can be processed by the CM detection unit 15 and transmits it to the CM detection unit 15. When transmitting the various signals to the theme song detection unit 14 and the CM detection unit 15, the decoder unit 13 adds time information so that the separated signals can be synchronized with one another.

主題歌検出部１４は、デコーダ部１３から受信した各種信号を用いて、後述する各機能
ブロックにより実行される検出手法に従い主題歌区間を検出する。主題歌検出部１４は、
検出した主題歌区間を特定するメタデータに基づき、出力する主題歌区間を決定し、決定
された主題歌区間を記録再生部１１に送信する。主題歌検出部１４の機能については、図
２を用いて後述する。 The theme song detection unit 14 uses the various signals received from the decoder unit 13 to detect theme song sections according to the detection methods executed by the functional blocks described later. Based on the metadata specifying the detected theme song sections, the theme song detection unit 14 determines the theme song section to be output and transmits the determined theme song section to the recording/reproducing unit 11. The functions of the theme song detection unit 14 will be described later with reference to FIG. 2.

ＣＭ検出部１５は、デコーダ部１３から受信した音声信号を用いて、本編音楽区間とＣ
Ｍ区間とを検出する。ＣＭ検出部１５は、検出したＣＭ区間の時刻情報を主題歌検出部１
４に送信する。ＣＭ区間の検出手法としては特開平３−１８４４８３号の手法を用いるこ
とができるため、この点について詳細な説明は省略する。 The CM detection unit 15 uses the audio signal received from the decoder unit 13 to detect main music sections and CM sections. The CM detection unit 15 transmits the time information of the detected CM sections to the theme song detection unit 14. Since the method of Japanese Patent Laid-Open No. 3-184483 can be used to detect CM sections, a detailed description is omitted here.

受信部１６は、録画再生装置１を操作するためのリモートコントローラ（図示しない）
等から操作信号を受信し、再生制御部１７にて処理できる形式に変換して再生制御部１７
へ送信する。操作信号とは例えばコンテンツの再生や各種の特殊再生を指示するもの等で
ある。 The receiving unit 16 receives an operation signal from a remote controller (not shown) or the like used to operate the recording/playback apparatus 1, converts it into a format that can be processed by the reproduction control unit 17, and transmits it to the reproduction control unit 17. The operation signal is, for example, a signal instructing content playback or one of various kinds of special playback.

再生制御部１７は、受信部１６から受信した操作信号の表す処理を実行する。例えば、
受信した操作信号がコンテンツの再生を表す場合、記録再生部１１に記録装置１２からコ
ンテンツを読み出すように指示する。主題歌をスキップする特殊再生が指示された場合、
コンテンツと共に記憶された主題歌区間を特定するメタデータを共に読出し、特定された
主題歌区間をスキップする再生を実行する。 The reproduction control unit 17 executes the processing represented by the operation signal received from the receiving unit 16. For example, when the received operation signal indicates content playback, it instructs the recording/reproducing unit 11 to read the content from the recording device 12. When special playback that skips the theme song is instructed, it also reads the metadata specifying the theme song section stored together with the content and performs playback that skips the specified theme song section.
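As a rough illustration of skip playback, the sketch below turns a list of theme song sections into the time ranges that would actually be played. The function name `playback_ranges` and the representation of sections as `(start, end)` tuples in seconds are assumptions made for this example; the specification does not describe how the reproduction control unit 17 implements the skip.

```python
def playback_ranges(duration, skip_sections):
    """Return the (start, end) ranges left to play after removing skip_sections."""
    ranges = []
    pos = 0.0  # current playback position in seconds
    for start, end in sorted(skip_sections):
        # Clamp each section to the bounds of the content.
        start = min(max(start, 0.0), duration)
        end = min(max(end, 0.0), duration)
        if end <= pos:
            continue  # section lies entirely inside an already skipped part
        if start > pos:
            ranges.append((pos, start))  # play up to the start of the section
        pos = max(pos, end)              # jump past the section
    if pos < duration:
        ranges.append((pos, duration))   # play the remainder
    return ranges


# A 30-minute recording with an opening and an ending theme.
segments = playback_ranges(1800.0, [(60.0, 150.0), (1700.0, 1790.0)])
```

Sorting the sections first means overlapping or out-of-order entries in the metadata are handled without special cases.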

Ｉ／Ｆ部１８は、再生制御部１７と出力装置１９との間で各種信号の送信の仲介を行う
インターフェースである。再生制御部１７で再生されるコンテンツを、出力装置１９で出
力する形式に変換して出力装置１９へ送信する。 The I/F unit 18 is an interface that mediates the transmission of various signals between the reproduction control unit 17 and the output device 19. It converts the content played back by the reproduction control unit 17 into a format output by the output device 19 and transmits it to the output device 19.

出力装置１９は、Ｉ／Ｆ部１８から送信された各種信号を表示するための外部装置であ
る。出力装置１９は、例えばディスプレイ、スピーカー等が挙げられる。 The output device 19 is an external device for displaying various signals transmitted from the I / F unit 18. Examples of the output device 19 include a display and a speaker.

次に、図２を用いて主題歌検出部１４の機能について説明する。図２は、本実施の形態
における主題歌検出部１４の機能ブロック図である。 Next, the functions of the theme song detection unit 14 will be described with reference to FIG. 2. FIG. 2 is a functional block diagram of the theme song detection unit 14 in the present embodiment.

主題歌検出部１４は、音楽区間検出部１０１と、提供元シーン検出部１０２と、ショッ
ト検出部１０３と、テロップ検出部１０４と、字幕検出部１０５と、言語解析部１０６と
、検出情報管理部１０７と、メモリ部１０７ａとから構成されている。 The theme song detection unit 14 includes a music section detection unit 101, a provider scene detection unit 102, a shot detection unit 103, a telop detection unit 104, a caption detection unit 105, a language analysis unit 106, a detection information management unit 107, and a memory unit 107a.

音楽区間検出部１０１は、デコーダ部１３から音声信号、ＣＭ検出部１５からＣＭ区間
情報をそれぞれ受信する。まず、音楽区間検出部１０１は、受信したＣＭ区間をコンテン
ツから除いたコンテンツ本編を検出する。次に、音楽区間検出部１０１は、コンテンツ本
編区間から、音楽が挿入されている区間を検出する。音楽が挿入されている区間を本編音
楽区間と呼べば、コンテンツ本編区間に比べて本編音楽区間の収録時間が短くなる。ここ
では、コンテンツ本編区間を大区間、本編音楽区間は小区間と言うこともできる。 The music section detection unit 101 receives the audio signal from the decoder unit 13 and the CM section information from the CM detection unit 15. First, the music section detection unit 101 detects the main part of the content, that is, the content with the received CM sections removed. Next, the music section detection unit 101 detects, within the main content section, the sections in which music is inserted. If a section in which music is inserted is called a main music section, a main music section is shorter in recorded duration than the main content section. The main content section can therefore be regarded as a large section and each main music section as a small section.
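The relationship between CM sections and main music sections can be sketched as a simple filter. This is a simplification under stated assumptions: sections are `(start, end)` tuples in seconds, music sections have already been detected over the whole recording, and any music section that overlaps a CM section is discarded. The specification instead removes the CM sections first and then searches the main part for music; the function names here are hypothetical.

```python
def overlaps(a, b):
    """True when half-open intervals a and b share any time."""
    return a[0] < b[1] and b[0] < a[1]


def main_music_sections(music_sections, cm_sections):
    """Keep only the music sections that lie outside every CM section."""
    return [m for m in music_sections
            if not any(overlaps(m, c) for c in cm_sections)]


music = [(0.0, 90.0), (600.0, 630.0), (1700.0, 1790.0)]
cms = [(590.0, 650.0)]
small_sections = main_music_sections(music, cms)
```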

次に、音楽区間検出部１０１は、検出された小区間を提供元シーン検出部１０２と、シ
ョット検出部１０３と、テロップ検出部１０４と、字幕検出部１０５とに送信する。即ち
、提供元シーン検出部１０２と、ショット検出部１０３と、テロップ検出部１０４と、字
幕検出部１０５とで扱う区間は、音楽区間検出部１０１で検出された本編の音楽区間であ
る。音楽区間を検出する手法は、例えば特開２００８−３１０３３８号の手法を用いるこ
とができるため、この点について詳細な説明は省略する。ただし、音楽区間検出の手法は
これに限定されるものではない。 Next, the music section detection unit 101 transmits the detected small sections to the provider scene detection unit 102, the shot detection unit 103, the telop detection unit 104, and the caption detection unit 105. That is, the sections handled by the provider scene detection unit 102, the shot detection unit 103, the telop detection unit 104, and the caption detection unit 105 are the main music sections detected by the music section detection unit 101. Since the method of Japanese Patent Application Laid-Open No. 2008-310338 can be used, for example, to detect music sections, a detailed description is omitted here. However, the method of music section detection is not limited to this.

提供元シーン検出部１０２は、音楽区間検出部１０１から受信した本編音楽区間に存在
する映像信号及び音声信号をデコーダ部１３から受信し、予め登録された提供元シーンの
画像又は音声とパターンマッチングを行う。提供元シーンとは、ドラマやアニメなどでは
コンテンツの主題歌区間の前後に挿入され、提供元のロゴ等を表示するものである。音楽
区間検出部１０１から受信した本編音楽区間の前後の一定区間中から、提供元シーンを検
出することで検出された本編音楽区間が主題歌区間か否かを判別する。提供元シーン検出
部１０２は、提供元を表すシーンを検出した場合、提供元シーン検出部１０２によって主
題歌区間を検出したこと、検出された提供元シーン、主題歌区間として検出した本編音楽
区間の時刻情報とが検出情報管理部１０７に送信する。提供元シーンによる主題歌区間を
検出する手順については、図３を用いて後述する。 The provider scene detection unit 102 receives from the decoder unit 13 the video signal and audio signal present in a main music section received from the music section detection unit 101, and performs pattern matching against pre-registered images or audio of provider scenes. A provider scene is a scene that, in dramas, animation, and the like, is inserted before or after the theme song section of the content and displays the provider's logo or the like. By detecting a provider scene within a fixed interval before and after a main music section received from the music section detection unit 101, the unit determines whether the main music section is a theme song section. When the provider scene detection unit 102 detects a scene representing the provider, it transmits to the detection information management unit 107 the fact that the theme song section was detected by the provider scene detection unit 102, the detected provider scene, and the time information of the main music section detected as the theme song section. The procedure for detecting a theme song section from a provider scene will be described later with reference to FIG. 3.

ショット検出部１０３は、音楽区間検出部１０１から受信した本編音楽区間に存在する
映像信号をデコーダ部１３から受信し、映像信号により構成される連続した映像をショッ
ト単位に分割するカット点を検出する。ここでショットとは、切れ目なしに連続して撮影
された映像のことである。ショット検出の手段は例えば、１枚毎に入力された画像フレー
ムに対し、直前に入力された画像フレームとの類似度を計算し、画像の内容が切り替わる
点を検出する手法を用いることができる。ただし、ショット検出の方法はこれに限定され
るものではない。オープニングシーン及びエンディングシーン（以後、ＯＰ／ＥＤと表す
）ではコンテンツ本編に比べてショットの間隔が小さいことが多い。即ち、ＯＰ／ＥＤは
、そのコンテンツの前回までのあらすじ若しくは次回の予告等を表すような役割を担うも
のであり、象徴的なシーンを集約させた構成であることが多いためである。そこで、検出
したショット間隔を複数の本編音楽区間で比較し、ショット間隔の平均が最も短い区間を
主題歌区間として検出することができる。ショット検出部１０３は、ショット間隔を検出
した場合、ショット検出部１０３によってショット間隔を検出したこと、主題歌区間とし
て検出した本編音楽区間の時刻情報とが検出情報管理部１０７に送信する。 The shot detection unit 103 receives from the decoder unit 13 the video signal present in a main music section received from the music section detection unit 101, and detects the cut points that divide the continuous video formed by the video signal into shots. Here, a shot is video captured continuously without a break. As one means of shot detection, for each input image frame, the similarity to the immediately preceding image frame can be calculated and the points at which the image content switches can be detected. However, the shot detection method is not limited to this. In opening scenes and ending scenes (hereinafter, OP/ED), the shot intervals are often shorter than in the main part of the content. This is because the OP/ED serves to convey, for example, a synopsis of the content so far or a preview of the next episode, and is often composed of a collection of symbolic scenes. The detected shot intervals can therefore be compared across multiple main music sections, and the section with the shortest average shot interval can be detected as the theme song section. When the shot detection unit 103 has detected the shot intervals, it transmits to the detection information management unit 107 the fact that the detection was made from shot intervals by the shot detection unit 103 and the time information of the main music section detected as the theme song section.
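The frame-to-frame similarity approach mentioned above can be sketched as follows. The specification does not name a similarity measure; histogram intersection over grayscale values is one common choice and is used here purely as an assumption, as are the function names, the bin count, and the threshold. Frames are modelled as non-empty lists of 0 to 255 pixel values to keep the example dependency-free.

```python
def histogram(frame, bins=8):
    """Normalized grayscale histogram of a non-empty frame (pixel values 0..255)."""
    counts = [0] * bins
    for pixel in frame:
        counts[min(pixel * bins // 256, bins - 1)] += 1
    return [c / len(frame) for c in counts]


def similarity(frame_a, frame_b):
    """Histogram intersection: 1.0 for identical distributions, 0.0 for disjoint ones."""
    return sum(min(x, y) for x, y in zip(histogram(frame_a), histogram(frame_b)))


def detect_cuts(frames, threshold=0.5):
    """Indices i where frame i differs strongly from frame i-1 (a cut point)."""
    return [i for i in range(1, len(frames))
            if similarity(frames[i - 1], frames[i]) < threshold]


dark = [10] * 16     # toy 16-pixel dark frame
bright = [240] * 16  # toy 16-pixel bright frame
cuts = detect_cuts([dark, dark, bright, bright, dark])
```

A real implementation would work on decoded video frames and would likely need a measure that tolerates gradual transitions such as fades, which this toy threshold does not address.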

テロップ検出部１０４は、音楽区間検出部１０１から受信した本編音楽区間に存在する
映像信号をデコーダ部１３から受信し、映像信号により構成される映像からテロップを抽
出する。そして、テロップ検出部１０４は、抽出したテロップを文字列に変換し言語解析
部１０６に送信する。ここでテロップとは画像中に埋め込まれた文字列であり、オープン
キャプションとも呼ばれるものである。即ちテロップとは、映像信号と別に送信される字
幕信号から構成されるクローズドキャプションとは区別している。テロップを検出する手
法は例えば、特登３６９２０１８号の手法を用いることができるため、この点について詳
細な説明は省略する。ただし、テロップ検出の手法はこれに限定されるものではない。 The telop detection unit 104 receives from the decoder unit 13 the video signal present in a main music section received from the music section detection unit 101, and extracts telops from the video formed by the video signal. The telop detection unit 104 then converts the extracted telops into character strings and transmits them to the language analysis unit 106. Here, a telop is a character string embedded in the image, also called an open caption. That is, a telop is distinguished from a closed caption, which is formed from a caption signal transmitted separately from the video signal. Since the method of Japanese Patent No. 3692018 can be used, for example, to detect telops, a detailed description is omitted here. However, the telop detection method is not limited to this.

字幕検出部１０５は、音楽区間検出部１０１から受信した本編音楽区間に存在する字幕
信号（クローズドキャプションとも呼ぶ）をデコーダ部１３から受信し、その字幕信号を
文字列に変換し言語解析部１０６に送信する。尚、字幕検出部１０５は、字幕信号には図
情報も含まれる場合、その図の意味を解析して文字列に変換する。 The caption detection unit 105 receives a caption signal (also referred to as closed caption) present in the main music segment received from the music segment detection unit 101 from the decoder unit 13, converts the caption signal into a character string, and sends it to the language analysis unit 106. Send. When the caption signal includes graphic information, the caption detection unit 105 analyzes the meaning of the graphic and converts it into a character string.

言語解析部１０６は、テロップ検出部１０４及び字幕検出部１０５から受信した文字列
を処理し、それらの文字列中に予め登録された主題歌、音楽及び提供元を表すキーワード
が含まれているか否かを判別する。テロップ検出部１０４から取得したテロップにより、
ＯＰ／ＥＤに含まれるキーワード、及び提供元を表すキーワードを検出する手法は、図５
を用いて後述する。ＯＰ／ＥＤに含まれるキーワードとは、例えば、「コンテンツタイト
ル」、「スタッフ」、「製作」、「キャスト」、「音楽」、「オープニングテーマ」、「
エンディングテーマ」等である。また、提供元を表すキーワードとは、例えば、「提供」
、「ｓｐｏｎｓｏｒｅｄｂｙ」等である。 The language analysis unit 106 processes the character strings received from the telop detection unit 104 and the caption detection unit 105, and determines whether they contain pre-registered keywords representing a theme song, music, or a provider. The method of detecting keywords contained in the OP/ED and keywords representing a provider from the telops acquired from the telop detection unit 104 will be described later with reference to FIG. 5. Keywords contained in the OP/ED are, for example, "content title", "staff", "production", "cast", "music", "opening theme", and "ending theme". Keywords representing a provider are, for example, "provided by" (提供) and "sponsored by".

また、字幕検出部１０５から取得した字幕により主題歌及び音楽を表すキーワードを検
出する手法は、図６を用いて後述する。主題歌を表すキーワードとは、例えば「テーマ音
楽」、「テーマ」、「オープニング」、「エンディング」等である。また、音楽を表すキ
ーワードとは、「♪〜」や「音楽」等である。 A method for detecting a keyword representing a theme song and music from subtitles acquired from the subtitle detection unit 105 will be described later with reference to FIG. The keyword representing the theme song is, for example, “theme music”, “theme”, “opening”, “ending”, and the like. In addition, keywords representing music include “♪ ˜” and “music”.

言語解析部１０６は、特定のキーワードを検出した場合、テロップ若しくは字幕の何れ
を用いて主題歌区間を検出したかを示す情報、検出されたキーワード、主題歌区間として
検出した本編音楽区間の時刻情報を検出情報管理部１０７に送信する。一方、特定のキー
ワードが含まれない場合、字幕が無い区間（無字幕区間と呼ぶ）を主題歌区間として検出
したこと、主題歌区間として検出した本編音楽区間の時刻情報を検出情報管理部１０７に
送信する。無字幕区間により主題歌区間を検出する手法は、図７を用いて後述する。 When the language analysis unit 106 detects a specific keyword, it transmits to the detection information management unit 107 information indicating whether the theme song section was detected using telops or captions, the detected keyword, and the time information of the main music section detected as the theme song section. When no specific keyword is contained, it transmits to the detection information management unit 107 the fact that a section without captions (called a non-caption section) was detected as the theme song section, together with the time information of the main music section detected as the theme song section. The method of detecting a theme song section from a non-caption section will be described later with reference to FIG. 7.
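The keyword check performed by the language analysis unit 106 can be illustrated with plain substring matching. The keyword lists below are taken from the examples in the text; everything else is an assumption for illustration, including the function names, the order in which telops and captions are consulted, and the rule that an empty caption string counts as a non-caption section. The actual procedures are those of FIGS. 5 to 7.

```python
# Example keywords quoted in the description (OP/ED and provider keywords for telops).
TELOP_KEYWORDS = ["スタッフ", "製作", "キャスト", "音楽",
                  "オープニングテーマ", "エンディングテーマ", "提供"]
# Example keywords quoted in the description (theme song and music keywords for captions).
CAPTION_KEYWORDS = ["テーマ音楽", "テーマ", "オープニング", "エンディング", "♪", "音楽"]


def find_keywords(text, keywords):
    """Return the registered keywords that occur in text, in list order."""
    return [k for k in keywords if k in text]


def analyze_section(telop_text, caption_text):
    """Return (method, matched_keywords) for one main music section."""
    hits = find_keywords(telop_text, TELOP_KEYWORDS)
    if hits:
        return ("telop", hits)
    hits = find_keywords(caption_text, CAPTION_KEYWORDS)
    if hits:
        return ("caption", hits)
    if not caption_text.strip():
        return ("no_caption", [])  # assumed rule: no captions at all in the section
    return (None, [])              # captions present but no keyword matched


result = analyze_section("キャスト 山田", "")
```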

検出情報管理部１０７は、提供元シーン検出部１０２と、ショット検出部１０３と、言
語解析部１０６とから受信した主題歌区間を特定するメタデータをメモリ部１０７ａに記
憶する。また検出情報管理部１０７は、メモリ部１０７ａからメタデータを読み出し、記
録再生部１１に主題歌区間を出力する。この主題歌区間の決定については、図８及び図９
を用いて後述する。 The detection information management unit 107 stores in the memory unit 107a the metadata specifying the theme song sections received from the provider scene detection unit 102, the shot detection unit 103, and the language analysis unit 106. The detection information management unit 107 also reads the metadata from the memory unit 107a and outputs the theme song section to the recording/reproducing unit 11. How this theme song section is determined will be described later with reference to FIGS. 8 and 9.

次に、提供元シーンの画像又は音声を用いた提供元シーン検出部１０２による主題歌区
間の検出手順について、図３を用いて説明する。図３は、本実施の形態における提供元シ
ーンによる検出手順を表すフローチャートである。 Next, the procedure by which the provider scene detection unit 102 detects a theme song section using the image or audio of a provider scene will be described with reference to FIG. 3. FIG. 3 is a flowchart showing the detection procedure based on provider scenes in the present embodiment.

まず、事前に音楽区間検出部１０１において検出した本編音楽区間の内から、ステップ
Ｓ１０１以降の処理が行われていない音楽区間があるか否かを判別する（ステップＳ１０
０）。その結果、ステップＳ１０１以降の処理が行われていない本編音楽区間が無いと判
別した場合（ステップＳ１００のＮｏ）、この手順を終了する。一方、ステップＳ１０１
以降の処理が行われていない本編音楽区間があると判別した場合（ステップＳ１００のＹ
ｅｓ）、次にステップＳ１０１以降の処理が行われていない本編音楽区間の１つを取得す
る（ステップＳ１０１）。 First, it is determined whether, among the main music sections detected in advance by the music section detection unit 101, there is a section on which the processing from step S101 onward has not yet been performed (step S100). If it is determined that no such main music section remains (No in step S100), the procedure ends. If it is determined that such a main music section exists (Yes in step S100), one of the main music sections on which the processing from step S101 onward has not been performed is acquired (step S101).

次に、取得した本編音楽区間の長さが所定の時間内に収まっているか否かを判別する（
ステップＳ１０２）。通常、主題歌区間は数分であるから、時間に閾値を設けて本編音楽
区間の中から主題歌区間の候補を選出する。 Next, it is determined whether the length of the acquired main music section is within a predetermined time (step S102). Since a theme song section is normally a few minutes long, a threshold is set on the duration to select theme song section candidates from among the main music sections.

その結果、その取得した本編音楽区間が所定の時間内に収まっていないと判別した場合
（ステップＳ１０２のＮｏ）、前述したステップＳ１００に戻る。一方、取得した本編音
楽区間が所定の時間内であると判別した場合（ステップＳ１０２のＹｅｓ）、取得した本
編音楽区間の前後１ショットの取得を実行する（ステップＳ１０３）。主題歌区間の前後
に提供元のロゴ等を表すシーンが挿入されることが多いため、前後のショットを取得して
提供元シーンの検出を行う。 If, as a result, it is determined that the acquired main music section is not within the predetermined time (No in step S102), the process returns to step S100 described above. If it is determined that the acquired main music section is within the predetermined time (Yes in step S102), one shot before and one shot after the acquired main music section are acquired (step S103). Because a scene showing the provider's logo or the like is often inserted before or after the theme song section, the preceding and following shots are acquired and searched for a provider scene.

次に、取得を実行したショットと予め登録していおいた提供元を表す画像及び音声のマ
ッチングを行い、取得を実行したショット内に提供元シーンが存在するか否かを判別する
（ステップＳ１０４）。 Next, the acquired shots are matched against the pre-registered images and audio representing the provider, and it is determined whether a provider scene is present in the acquired shots (step S104).

その結果、ショットの取得を実行した区間に提供元を表す画像が存在しないと判別した
場合（ステップＳ１０４のＮｏ）、前述したステップＳ１００に戻る。一方、ショットの
取得を実行した区間に提供元を表す画像が存在すると判別した場合（ステップＳ１０４の
Ｙｅｓ）、ステップＳ１０１で取得した区間を主題歌区間として検出する（ステップＳ１
０５）。即ち、提供元シーン検出部１０２は、主題歌区間を検出した際に、提供元シーン
の画像又は音声を用いて主題歌区間を検出したことを示す情報、検出された提供元シーン
の画像又は音声、主題歌区間として検出した本編音楽区間の時刻情報を検出情報管理部１
０７に送信する。そして、前述のステップＳ１００に戻る。即ち、全ての本編音楽区間に
ついて処理が行われるまで、以上の手順が繰り返される。 If, as a result, it is determined that no image representing the provider is present in the acquired shots (No in step S104), the process returns to step S100 described above. If it is determined that an image representing the provider is present in the acquired shots (Yes in step S104), the section acquired in step S101 is detected as a theme song section (step S105). That is, upon detecting a theme song section, the provider scene detection unit 102 transmits to the detection information management unit 107 information indicating that the theme song section was detected using the image or audio of a provider scene, the detected image or audio of the provider scene, and the time information of the main music section detected as the theme song section. The process then returns to step S100. In other words, the above procedure is repeated until all main music sections have been processed.
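The loop of FIG. 3 (steps S100 to S105) can be summarized in a few lines. In this sketch, `adjacent_shots` and `is_provider_scene` are hypothetical callables standing in for shot acquisition and for pattern matching against registered provider images or audio, and the 300-second limit is an assumed value for the "predetermined time"; none of these are specified in the text.

```python
def detect_by_provider_scene(music_sections, adjacent_shots, is_provider_scene,
                             max_len=300.0):
    """Return the main music sections judged to be theme song sections."""
    detected = []
    for section in music_sections:                # S100/S101: next unprocessed section
        start, end = section
        if end - start > max_len:                 # S102: longer than the limit, skip
            continue
        before, after = adjacent_shots(section)   # S103: one shot before and after
        if is_provider_scene(before) or is_provider_scene(after):  # S104
            detected.append(section)              # S105: theme song section
    return detected


# Toy data: each section maps to labels for the shots just before and after it.
shots = {
    (0.0, 90.0): ("cold_open", "sponsor_logo"),
    (600.0, 640.0): ("drama", "drama"),
    (900.0, 1500.0): ("sponsor_logo", "drama"),  # too long to be a theme song
}
found = detect_by_provider_scene(list(shots), shots.get,
                                 lambda shot: shot == "sponsor_logo")
```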

次に、図４を用いて、ショット間隔の算出結果を用いたショット検出部１０３による主
題歌区間の検出を説明する。図４は、本実施の形態におけるショット間隔による検出手順
を表すフローチャートである。 Next, the detection of the theme song section by the shot detection unit 103 using the calculation result of the shot interval will be described with reference to FIG. FIG. 4 is a flowchart showing a detection procedure based on shot intervals in the present embodiment.

まず、事前に音楽区間検出部１０１において検出した本編音楽区間の内、ステップＳ１
１１以降の処理が行われていない音楽区間があるか否かを判別する（ステップＳ１１０）
。その結果、ステップＳ１１１以降の処理が行われていない本編音楽区間が無いと判別し
た場合（ステップＳ１１０のＮｏ）、後述するステップＳ１１４に進む。一方、ステップ
Ｓ１１１以降の処理が行われていない本編音楽区間があると判別した場合（ステップＳ１
１０のＹｅｓ）、次にステップＳ１１１以降の処理が行われていない本編音楽区間の１つ
を取得する（ステップＳ１１１）。 First, it is determined whether, among the main music sections detected in advance by the music section detection unit 101, there is a section on which the processing from step S111 onward has not yet been performed (step S110). If it is determined that no such main music section remains (No in step S110), the process proceeds to step S114 described later. If it is determined that such a main music section exists (Yes in step S110), one of the main music sections on which the processing from step S111 onward has not been performed is acquired (step S111).

Next, it is determined whether the length of the acquired main music section falls within a predetermined time (step S112). When it is determined that the acquired main music section does not fall within the predetermined time (No in step S112), the process returns to step S110 described above. On the other hand, when it is determined that the acquired main music section falls within the predetermined time (Yes in step S112), the average of the shot intervals in the acquired main music section is calculated (step S113).

The process then returns to step S110. That is, the above procedure is repeated until shot intervals have been calculated for all main music sections.

Next, when the calculation of shot intervals has been completed for all main music sections (No in step S110), each section whose average shot interval is within a predetermined time is detected as a theme song section (step S114). That is, when detecting a theme song section, the shot detection unit 103 sends to the detection information management unit 107 information indicating that the theme song section was detected using the shot-interval detection result, together with the time information of the main music section detected as the theme song section. In the present embodiment, theme song sections are detected by calculating shot intervals, but they may instead be detected by the number of shots. That is, a section in which the number of shots exceeds a predetermined number may be detected as a theme song section. The predetermined time and the predetermined number are assumed to be set from the average shot interval and the average number of shots of conventional OP/ED sequences.
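The shot-interval procedure of FIG. 4 (steps S110 to S114) can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation of the embodiment: the `Section` type, the function names, and the thresholds `max_length` and `max_avg_interval` are hypothetical names introduced here, and shot changes are assumed to be available as a list of cut times in seconds.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Section:
    """A main music section; times are seconds from the start of the content."""
    start: float
    end: float

    @property
    def length(self) -> float:
        return self.end - self.start


def average_shot_interval(section: Section, cuts: List[float]) -> float:
    """Step S113: mean shot length inside one section.

    Only cut times strictly inside the section split it into shots.
    """
    inner = sorted(c for c in cuts if section.start < c < section.end)
    bounds = [section.start, *inner, section.end]
    intervals = [b - a for a, b in zip(bounds, bounds[1:])]
    return sum(intervals) / len(intervals)


def detect_by_shot_interval(sections, cuts, max_length, max_avg_interval):
    """Return the sections treated as theme song sections (assumed thresholds)."""
    averages: Dict[Section, float] = {}
    for section in sections:                 # S110 / S111: next unprocessed section
        if section.length > max_length:      # S112: No, skip sections that are too long
            continue
        averages[section] = average_shot_interval(section, cuts)  # S113
    # S114: keep sections whose average shot interval is within the threshold
    return [s for s, avg in averages.items() if avg <= max_avg_interval]
```

The shot-count variant mentioned above would compare the number of shots in a section against a minimum count instead of comparing the average interval against a maximum.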

Next, the detection of theme song sections by the language analysis unit 106 using telops will be described with reference to FIG. 5. FIG. 5 is a flowchart showing the detection procedure based on telops in the present embodiment.

First, it is determined whether, among the main music sections detected in advance by the music section detection unit 101, there is any section for which the processing from step S121 onward has not yet been performed (step S120). When it is determined that no such main music section remains (No in step S120), this procedure ends. On the other hand, when it is determined that such a main music section exists (Yes in step S120), one of the main music sections not yet processed from step S121 onward is acquired (step S121).

Next, it is determined whether the length of the acquired main music section falls within a predetermined time (step S122). When it is determined that the acquired main music section does not fall within the predetermined time (No in step S122), the process returns to step S120 described above. On the other hand, when it is determined that the acquired main music section falls within the predetermined time (Yes in step S122), the telops corresponding to the acquired main music section are acquired (step S123).

Next, it is determined whether a keyword representing the OP/ED or the provider exists in the section for which the telops were acquired (step S124). When it is determined that no keyword exists in that section (No in step S124), the process returns to step S120 described above. On the other hand, when it is determined that a keyword exists in that section (Yes in step S124), the section is detected as a theme song section (step S125). That is, when detecting a theme song section, the language analysis unit 106 sends to the detection information management unit 107 information indicating that the theme song section was detected using telops, the detected keyword, and the time information of the main music section detected as the theme song section. The process then returns to step S120 described above. That is, the above procedure is repeated until all main music sections have been processed.
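The telop-based procedure of FIG. 5 (steps S120 to S125) might be sketched as below. The sketch assumes the telop text has already been extracted by the telop detection unit 104 and is given as `(time, text)` pairs. The `Detection` record, the method labels, and the keyword lists are hypothetical placeholders, since the description does not enumerate the actual keywords.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple


class Detection(NamedTuple):
    """What is reported to the detection information management unit 107."""
    method: str    # which cue produced the detection
    keyword: str   # the keyword that was found
    start: float   # time information of the main music section
    end: float


# Placeholder keywords (assumption); grouped by what they represent.
TELOP_KEYWORDS: Dict[str, List[str]] = {
    "op_ed_telop": ["opening theme", "ending theme"],
    "provider_telop": ["sponsored by"],
}


def find_keyword(texts: List[str]) -> Optional[Tuple[str, str]]:
    """Step S124: return (method, keyword) for the first keyword found."""
    for method, words in TELOP_KEYWORDS.items():
        for word in words:
            if any(word in text.lower() for text in texts):
                return method, word
    return None


def detect_by_telop(sections, telops, max_length) -> List[Detection]:
    """sections: list of (start, end); telops: list of (time, text)."""
    detections = []
    for start, end in sections:                       # S120 / S121
        if end - start > max_length:                  # S122: No
            continue
        texts = [text for t, text in telops if start <= t <= end]  # S123
        hit = find_keyword(texts)                     # S124
        if hit is not None:
            method, word = hit
            detections.append(Detection(method, word, start, end))  # S125
    return detections
```

Keeping the OP/ED and provider keywords in separate groups lets the later selection step treat the two kinds of telop result independently.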

Next, the detection of theme song sections by the language analysis unit 106 using subtitles will be described with reference to FIG. 6. FIG. 6 is a flowchart showing the detection procedure based on subtitles in the present embodiment.

First, it is determined whether, among the main music sections detected in advance by the music section detection unit 101, there is any section for which the processing from step S131 onward has not yet been performed (step S130). When it is determined that no such main music section remains (No in step S130), this procedure ends. On the other hand, when it is determined that such a main music section exists (Yes in step S130), one of the main music sections not yet processed from step S131 onward is acquired (step S131).

Next, it is determined whether the length of the acquired main music section falls within a predetermined time (step S132). When it is determined that the acquired main music section does not fall within the predetermined time (No in step S132), the process returns to step S130 described above. On the other hand, when it is determined that the acquired main music section falls within the predetermined time (Yes in step S132), the subtitles corresponding to the acquired main music section are acquired (step S133).

Next, it is determined whether a keyword representing a theme song or music exists in the acquired subtitles (step S134). When it is determined that no keyword exists in the acquired subtitles (No in step S134), the process returns to step S130 described above. On the other hand, when it is determined that a keyword exists in the acquired subtitles (Yes in step S134), the section is detected as a theme song section (step S135). That is, when detecting a theme song section, the language analysis unit 106 sends to the detection information management unit 107 information indicating that the theme song section was detected using subtitles, the detected keyword, and the time information of the main music section detected as the theme song section. The process then returns to step S130 described above. That is, the above procedure is repeated until all main music sections have been processed.
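As a rough sketch of the subtitle-based procedure of FIG. 6 (steps S130 to S135), the following assumes that subtitle cues carry their own display times and that a cue belongs to a section when the two overlap. The `Cue` type, the labels, and the keyword tuples (including the use of a musical note symbol as a music keyword) are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Cue:
    """One subtitle cue with its display interval in seconds."""
    start: float
    end: float
    text: str


THEME_SONG_WORDS = ("theme song",)     # assumed keywords for "theme song"
MUSIC_WORDS = ("\u266a", "music")      # assumed keywords for "music" (U+266A is a note symbol)


def cues_in(section: Tuple[float, float], cues: List[Cue]) -> List[Cue]:
    """Step S133: cues whose display interval overlaps the section."""
    start, end = section
    return [c for c in cues if c.start < end and c.end > start]


def classify_subtitles(section, cues) -> Optional[str]:
    """Step S134: which kind of keyword, if any, appears in the section."""
    text = " ".join(c.text.lower() for c in cues_in(section, cues))
    if any(w in text for w in THEME_SONG_WORDS):
        return "theme_song_subtitle"
    if any(w in text for w in MUSIC_WORDS):
        return "music_subtitle"
    return None


def detect_by_subtitles(sections, cues, max_length):
    """Return (section, label) pairs for sections detected via subtitles."""
    results = []
    for section in sections:                       # S130 / S131
        if section[1] - section[0] > max_length:   # S132: No
            continue
        label = classify_subtitles(section, cues)  # S133 / S134
        if label is not None:
            results.append((section, label))       # S135
    return results
```

The two labels mirror the distinction drawn later between results based on theme song subtitles and results based on music subtitles.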

Next, the detection of theme song sections based on sections without subtitles will be described with reference to FIG. 7. FIG. 7 is a flowchart showing the detection procedure based on non-subtitled sections in the present embodiment.

First, it is determined whether, among the main music sections detected in advance by the music section detection unit 101, there is any section for which the processing from step S141 onward has not yet been performed (step S140). When it is determined that no such main music section remains (No in step S140), this procedure ends. On the other hand, when it is determined that such a main music section exists (Yes in step S140), one of the main music sections not yet processed from step S141 onward is acquired (step S141).

Next, it is determined whether the length of the acquired main music section falls within a predetermined time (step S142). When it is determined that the acquired main music section does not fall within the predetermined time (No in step S142), the process returns to step S140 described above. On the other hand, when it is determined that the acquired main music section falls within the predetermined time (Yes in step S142), the subtitles corresponding to the acquired main music section are acquired (step S143).

Next, it is determined whether any subtitle exists in the section for which subtitle acquisition was performed (step S144). When it is determined that a subtitle exists in that section (Yes in step S144), the process returns to step S140 described above. On the other hand, when it is determined that no subtitle exists in that section (No in step S144), the section is detected as a theme song section (step S145). That is, when detecting a theme song section, the language analysis unit 106 sends to the detection information management unit 107 information indicating that the theme song section was detected using a non-subtitled section, together with the time information of the main music section detected as the theme song section. The process then returns to step S140 described above. That is, the above procedure is repeated until all main music sections have been processed.
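The non-subtitled-section procedure of FIG. 7 (steps S140 to S145) inverts the subtitle test: a section qualifies when no subtitle is present. A minimal sketch, assuming subtitle display intervals are given as `(start, end)` pairs and using the hypothetical names `has_subtitles`, `detect_by_missing_subtitles`, and `max_length`:

```python
from typing import List, Tuple

Span = Tuple[float, float]


def has_subtitles(section: Span, cue_spans: List[Span]) -> bool:
    """Step S144: True if any subtitle interval overlaps the section."""
    start, end = section
    return any(cs < end and ce > start for cs, ce in cue_spans)


def detect_by_missing_subtitles(sections: List[Span],
                                cue_spans: List[Span],
                                max_length: float) -> List[Span]:
    """Return sections short enough (S142) that contain no subtitle (S144: No)."""
    return [
        (start, end)
        for start, end in sections                        # S140 / S141
        if end - start <= max_length                      # S142
        and not has_subtitles((start, end), cue_spans)    # S143 / S144
    ]                                                     # S145
```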

Next, a procedure will be described by which the detection information management unit 107 determines the theme song section to be output to the recording / playback unit 11, using the metadata that identifies theme song sections and is stored in the memory unit 107a. Which metadata is used to determine the theme song section to be output is assumed to be either set in advance or settable by the user. FIG. 8 is a flowchart showing the procedure by which the detection information management unit 107 determines the theme song section to be output in the present embodiment.

First, the detection information management unit 107 determines whether it is set to output the result of detecting a keyword representing a theme song in the subtitles (the detection result based on theme song subtitles) (step S10). When it is determined that the detection result based on theme song subtitles is set to be output (Yes in step S10), the process proceeds to step S17 described later.

On the other hand, when it is determined that the detection result based on theme song subtitles is set not to be output (No in step S10), it is next determined whether it is set to output the result of detecting a keyword representing music in the subtitles (the detection result based on music subtitles) (step S11). When it is determined that the detection result based on music subtitles is set to be output (Yes in step S11), the process proceeds to step S17 described later.

On the other hand, when it is determined that the detection result based on music subtitles is set not to be output (No in step S11), it is next determined whether it is set to output the detection result based on non-subtitled sections (step S12). When it is determined that the detection result based on non-subtitled sections is set to be output (Yes in step S12), the process proceeds to step S17 described later.

On the other hand, when it is determined that the detection result based on non-subtitled sections is set not to be output (No in step S12), it is next determined whether it is set to output the result of detecting a keyword representing the OP/ED in the telops (the detection result based on OP/ED telops) (step S13). When it is determined that the detection result based on OP/ED telops is set to be output (Yes in step S13), the process proceeds to step S17 described later.

On the other hand, when it is determined that the detection result based on OP/ED telops is set not to be output (No in step S13), it is next determined whether it is set to output the result of detecting a keyword representing the provider in the telops (the detection result based on provider telops) (step S14). When it is determined that the detection result based on provider telops is set to be output (Yes in step S14), the process proceeds to step S17 described later.

On the other hand, when it is determined that the detection result based on provider telops is set not to be output (No in step S14), it is next determined whether it is set to output the detection result based on provider scenes (step S15). When it is determined that the detection result based on provider scenes is set to be output (Yes in step S15), the process proceeds to step S17 described later.

On the other hand, when it is determined that the detection result based on provider scenes is set not to be output (No in step S15), it is next determined whether it is set to output the detection result based on shot intervals (step S16). When it is determined that the detection result based on shot intervals is set to be output (Yes in step S16), the theme song section is output to the recording / playback unit 11 based on the detection result that has been set (step S17). This procedure then ends.
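The selection procedure of FIG. 8 amounts to walking a fixed priority list and outputting the first detection result that is enabled. The sketch below illustrates this under assumptions: the method labels, the `enabled` set standing in for the preset or user setting, and the behaviour when nothing is enabled (returning no sections) are not taken from the description.

```python
from typing import Dict, List, Optional, Sequence, Set, Tuple

Span = Tuple[float, float]

# Order of the checks in FIG. 8; the labels themselves are hypothetical.
PRIORITY: Sequence[str] = (
    "theme_song_subtitle",  # S10
    "music_subtitle",       # S11
    "no_subtitle",          # S12
    "op_ed_telop",          # S13
    "provider_telop",       # S14
    "provider_scene",       # S15
    "shot_interval",        # S16
)


def select_sections(results: Dict[str, List[Span]],
                    enabled: Set[str],
                    priority: Sequence[str] = PRIORITY
                    ) -> Tuple[Optional[str], List[Span]]:
    """Return (method, sections) for the first enabled method (step S17).

    If no method is enabled, (None, []) is returned; this fallback is an
    assumption, as the description only covers the enabled cases.
    """
    for method in priority:
        if method in enabled:
            return method, results.get(method, [])
    return None, []
```

Because the description states that the order of steps S10 to S16 is not fixed, the order is passed in as a parameter rather than hard-coded into the loop.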

In the present embodiment, the determination procedure of the detection information management unit 107 follows the order described above, but it is not limited to this order. That is, steps S10 to S16 may be performed in any order.

As a modification of the present embodiment, a procedure such as that shown in FIG. 9 is also conceivable. FIG. 9 is a flowchart showing a modification of the procedure by which the detection information management unit 107 determines the theme song section to be output in the present embodiment. In this modification, all of the detection results of steps S10 to S16 are used to determine the theme song section to be output.

Note that it is not necessary to use all of the detection results of steps S10 to S16; any of the detection results may be left unused.
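The description does not state how the modification of FIG. 9 combines the individual detection results. One possible reading, shown here purely as an assumption, is a vote count: each detection method votes for the main music sections it detected, and a section is output when it receives at least `min_votes` votes (with `min_votes=1` this is a simple union). The names `combine_results`, `use`, and `min_votes` are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List, Optional, Tuple

Span = Tuple[float, float]


def combine_results(results: Dict[str, List[Span]],
                    use: Optional[Iterable[str]] = None,
                    min_votes: int = 1) -> List[Span]:
    """Combine per-method detection results by counting votes per section.

    results:   method label -> sections detected by that method
    use:       methods to take into account (None means all of them)
    min_votes: how many methods must agree (assumed rule, not from the source)
    """
    methods = list(results.keys()) if use is None else list(use)
    votes: Counter = Counter()
    for method in methods:
        # set() so that one method cannot vote twice for the same section
        for section in set(results.get(method, [])):
            votes[section] += 1
    return sorted(s for s, n in votes.items() if n >= min_votes)
```

Passing a subset of method labels through `use` corresponds to leaving some of the detection results unused.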

In the present embodiment and its modification configured as described above, the theme song portions can be detected in serialized content consisting of multiple episodes. Accordingly, more efficient viewing is achieved, because the theme song sections inserted in the serialized content need not be viewed repeatedly.
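To illustrate how omitting theme song sections shortens viewing, the following sketch computes the time ranges that remain to be played once the detected sections are skipped. It is a hypothetical helper, not the playback control of the embodiment; `playback_ranges`, `duration`, and `skip_sections` are names assumed here, with all times in seconds.

```python
from typing import List, Tuple

Range = Tuple[float, float]


def playback_ranges(duration: float, skip_sections: List[Range]) -> List[Range]:
    """Return the ranges of [0, duration] left after removing skip_sections.

    Overlapping or unsorted skip sections are tolerated, and sections are
    clamped to the content duration.
    """
    ranges: List[Range] = []
    position = 0.0                                  # end of the last skipped part
    for start, end in sorted(skip_sections):
        start = min(max(start, 0.0), duration)
        end = min(max(end, 0.0), duration)
        if start > position:
            ranges.append((position, start))        # play up to the next skip
        position = max(position, end)
    if position < duration:
        ranges.append((position, duration))         # play the remainder
    return ranges
```

For a 1800-second episode with a 90-second opening at the start and a 90-second ending from 1700 seconds, the helper returns the ranges 90 to 1700 and 1790 to 1800.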

It is also possible to create a playlist by collecting only the detected theme song sections. By storing each theme song section together with the keyword used for its detection, it becomes possible to search for the theme song when it is also stored in the recording device 12 as other music content or the like.
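A playlist of detected theme song sections and a keyword search over other stored music content might look like the sketch below. The `ThemeSongEntry` record, the `content_id` field, and the title-based matching in `search_music` are assumptions for illustration; the description does not specify how entries are stored or matched.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ThemeSongEntry:
    """One detected theme song section with the keyword used to detect it."""
    content_id: str   # hypothetical identifier of the recorded content
    start: float
    end: float
    keyword: str


def build_playlist(entries: List[ThemeSongEntry]) -> List[ThemeSongEntry]:
    """Collect theme song sections, ordered by content and start time."""
    return sorted(entries, key=lambda e: (e.content_id, e.start))


def search_music(keyword: str, library_titles: List[str]) -> List[str]:
    """Find stored music content whose title contains the keyword."""
    needle = keyword.lower()
    return [title for title in library_titles if needle in title.lower()]
```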

Note that the present invention is not limited to the above embodiment as it stands; at the implementation stage, the constituent elements can be modified and embodied without departing from the gist of the invention. Various inventions can also be formed by appropriately combining the plurality of constituent elements disclosed in the above embodiment. For example, some constituent elements may be deleted from all the constituent elements shown in the embodiment. Furthermore, constituent elements from different embodiments may be combined as appropriate.

DESCRIPTION OF SYMBOLS
1 Recording / playback apparatus
10 Content receiving unit
11 Recording / playback unit
12 Recording device
13 Decoder unit
14 Theme song detection unit
15 CM detection unit
16 Receiving unit
17 Playback control unit
18 I/F unit
19 Output device
101 Music section detection unit
102 Provider scene detection unit
103 Shot detection unit
104 Telop detection unit
105 Subtitle detection unit
106 Language analysis unit
107 Detection information management unit
107a Memory unit

Claims

A recording / playback apparatus comprising:
a content receiving unit that receives content;
a storage device that stores the content received by the content receiving unit;
a decoder unit that separates at least a video signal, an audio signal, and a caption signal from the signals constituting the content received by the content receiving unit;
a first detection unit that detects, using the audio signal separated by the decoder unit, a first section containing music within the main part of the content;
a second detection unit that detects, from a character string formed by the caption signal present in the first section, a second section in which characters indicating a section repeated across serialized content are present;
a detection information management unit that determines, based on the detection result of the second detection unit, a section whose reproduction is to be omitted; and
a playback control unit that, when the content is played back, plays it back while omitting the section determined by the detection information management unit.

The recording / playback apparatus further comprising a third detection unit that extracts a telop from the video formed by the video signal present in the first section and detects a third section in which characters indicating a section repeated across the serialized content are present in the telop,
wherein the detection information management unit determines the section whose reproduction is to be omitted based on at least one of the detection results of the second detection unit and the third detection unit.

The recording / playback apparatus further comprising a fourth detection unit that detects a scene indicating the content provider using at least one of the video signal and the audio signal present in a fixed section before and after the first section and, when the scene is detected, detects the first section as a fourth section,
wherein the detection information management unit determines the section whose reproduction is to be omitted based on at least one of the detection results of the second detection unit to the fourth detection unit.

The recording / playback apparatus further comprising a fifth detection unit that divides the first section, at the points where the video of the first section switches, into a plurality of fifth sections each having a recording time shorter than that of the first section, and detects as a sixth section the first section for which the average recording time of the plurality of fifth sections is within a predetermined time,
wherein the detection information management unit determines the section whose reproduction is to be omitted based on at least one of the detection results of the second detection unit to the fifth detection unit.

A playback method comprising:
separation means for separating at least a video signal, an audio signal, and a caption signal from the signals constituting received content;
first detection means for detecting, using the audio signal separated by the separation means, a first section containing music within the main part of the content;
second detection means for detecting, from a character string formed by the caption signal present in the first section, a second section in which characters indicating a section repeated across serialized content are present;
detection information management means for determining, using the detection result of the second detection means, a section whose reproduction is to be omitted; and
playback control means for playing back the content while omitting the section determined by the detection information management means.

The playback method further comprising third detection means for extracting a telop from the video in the first section using the video signal separated by the separation means, and detecting a third section in which characters indicating a section repeated across the serialized content are present,
wherein the playback control means determines the section whose reproduction is to be omitted using any one of the detection results of the second detection means and the third detection means.

The playback method further comprising:
discriminating means for determining, using at least one of the video signal and the audio signal present in a fixed section before and after the first section, whether a scene indicating the content provider exists; and
fourth detection means for detecting the first section as a fourth section when the discriminating means determines that the scene exists,
wherein the playback control means determines the section whose reproduction is to be omitted using any one of the detection results of the second detection means to the fourth detection means.

The playback method according to claim 7, further comprising:
dividing means for dividing the first section, at the points where the signal of the first section switches, into a plurality of fifth sections each having a recording time shorter than that of the first section; and
fifth detection means for detecting, as a sixth section, the first section for which the average recording time of the fifth sections is within a predetermined time,
wherein the playback control means determines the section whose reproduction is to be omitted using any one of the detection results of the second detection means to the fifth detection means.