JP2000023062A

JP2000023062A - Digest production system

Info

Publication number: JP2000023062A
Application number: JP10199517A
Authority: JP
Inventors: Shozo Abe; 省三阿部; Shigeru Maeda; 茂前田; Tetsuya Abe; 哲也阿部; Hidenori Okita; 秀紀大喜多; Kazunobu Konta; 和宣紺田
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 1998-06-30
Filing date: 1998-06-30
Publication date: 2000-01-21

Abstract

PROBLEM TO BE SOLVED: To efficiently produce a digest video from video information, such as a recorded program, based on audio information synchronized with the video information or on telop video within the video. SOLUTION: A telop extracting part 104 extracts telop video from a program video fetched by a video fetching part 101, while an audio information extracting part 105 extracts a characteristic image or characteristic scene based on, e.g., the volume value of the audio information added in synchronization with the program video. A combination processing part 106 extracts a newly characteristic video part through the stepwise combination of the characteristic parts corresponding to the extracted telop video and audio information, and a digest producing part 108 produces a digest video based on the extracted video.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a digest creation method for creating a digest video from video information, and in particular to a digest creation method suitable for use in an automatic recording system that automatically records broadcast programs.

[0002]

2. Description of the Related Art

In recent years, several CS broadcasts have started in addition to terrestrial and BS broadcasts, making a larger number of programs available for viewing. The number of channels is expected to increase further as broadcasting is digitized. As a result, an environment is taking shape in which users can select programs that suit their tastes and preferences from a more diverse set of channels. Along with the increase in channels, the technical environment for higher quality, such as HD-TV, is also being put in place.

[0003]

PROBLEMS TO BE SOLVED BY THE INVENTION

However, as the number of channels increases, it becomes difficult to keep track of which programs are being broadcast, and consequently difficult to know whether a program one wants to watch is on. As a result, users may miss even programs they are interested in without noticing that they were broadcast.

[0004] Similarly, when recording broadcast programs, the user must select the programs to watch from a large number of programs, so programs are often missed or the user forgets to record them.

[0005] Thus, in the coming multi-channel era there are various problems in selecting programs to record according to personal tastes. Moreover, to find the program one wants to watch in video that has already been recorded, one conventionally has to watch it again from the beginning, either spending as long as the recording itself or fast-forwarding manually, which takes time. If the program turns out not to be one the viewer is interested in, a great deal of time is wasted.

[0006] The present invention has been made in view of the above circumstances, and its object is to provide a digest creation system that can efficiently create a digest video from video information, such as a recorded program, based on sound information synchronized with the video information or on telop video in the video.

[0007]

SUMMARY OF THE INVENTION

A digest creation system according to the present invention comprises sound information extracting means for extracting a characteristic image or characteristic scene of video information based on sound information added in synchronization with the video information, and digest creating means for creating a digest video based on the characteristic image or characteristic scene extracted by the sound information extracting means. In this configuration, a characteristic image or characteristic scene of the video information is extracted based on the sound information added in synchronization with it, so creating the digest from the video information using that characteristic image or scene makes it easy to create a digest video containing characteristic portions related to the sound information.

[0008] Here, to extract the characteristic image or characteristic scene, the sound information extracting means may use the volume value of the sound information added in synchronization with the video information and detect a state in which the time (time period) during which the volume value is continuously at or above a set value is at least a set time. For example, the image at the start of that state may be extracted as the characteristic image, or the continuous images during that state (time period) may be extracted as the characteristic scene.
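The detection condition in [0008] can be sketched as a scan for runs of loud samples. This is a minimal illustration, not the patent's implementation: it assumes the volume is available as one value per sample (for example, one per video frame), and the names `sustained_loud_intervals`, `min_volume`, and `min_duration` are hypothetical stand-ins for the "set value" and "set time".

```python
from typing import List, Tuple


def sustained_loud_intervals(
    volume: List[float], min_volume: float, min_duration: int
) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, for every run in which
    the volume stays at or above min_volume for at least min_duration samples."""
    intervals = []
    start = None  # index where the current loud run began, if any
    for i, v in enumerate(volume):
        if v >= min_volume:
            if start is None:
                start = i
        else:
            # The run just ended; keep it only if it lasted long enough.
            if start is not None and i - start >= min_duration:
                intervals.append((start, i))
            start = None
    # Handle a run that is still open at the end of the data.
    if start is not None and len(volume) - start >= min_duration:
        intervals.append((start, len(volume)))
    return intervals
```

The start index of each returned interval corresponds to the characteristic image, and the whole interval to the characteristic scene.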

[0009] In addition to the volume value, the sound information can be classified by type into voice, sound effects, and music, so that an image portion corresponding to a characteristic sound, for example one specified by the user or matching the user's tastes, is extracted as a characteristic image. In particular, by identifying attributes such as adult, child, male, and female for voice, and by preparing corresponding sound patterns in advance for sound effects and music, image portions corresponding to the characteristic sound specified by the user or matching the user's tastes (for example, video scenes corresponding to vehicle running sounds for a user whose hobby is driving) can be extracted at a finer level and cut and pasted on the time axis, editing them into a digest video that matches the user's tastes.

[0010] The present invention is also characterized by comprising telop extracting means for extracting telop video containing character information from video information, and digest creating means for creating a digest video based on the telop video extracted by the telop extracting means.

[0011] In this configuration, telop video containing character information is extracted from the video information, so creating the digest video using that telop video makes it easy to create a digest video containing characteristic portions related to the telops.

[0012] Here, considering that the area of the video screen in which a telop is displayed is specific to the type of video or the type of telop, telop video can be extracted efficiently by detecting changes over time in the image data within that area.

[0013] When the video information is a program video, two kinds of telop need to be distinguished: breaking-news telops, used in news programs and the like, and emphasis telops, used in variety programs and the like. The two have different characteristics: a breaking-news telop is displayed in a predetermined area at the top or bottom of the screen, whereas an emphasis telop is displayed across the whole screen in large characters and in a particular conspicuous color. For breaking-news telops, therefore, changes in luminance information over time are checked in predetermined areas at the top and bottom of the program video screen, and a point of large change can be detected as the start or end of a breaking-news telop. For emphasis telops, changes in color information over time are checked over the whole program video screen, and the start or end of an emphasis telop can be detected by detecting a state in which colors distant from the colors previously extracted on the time axis reach or exceed a set ratio.

[0014] The present invention is further characterized by providing combination processing means that combines the characteristic image or characteristic scene extracted by the sound information extracting means with the telop video extracted by the telop extracting means to extract a new characteristic video, the digest creating means creating the digest video based on the characteristic video extracted by the combination processing means.

[0015] In this configuration, creating the digest video by combining the two kinds of features makes it possible to create a high-quality digest video. In particular, by using the two features in stages, for example determining the time range of a characteristic scene from the characteristic video information given by telop display and then extracting, within that time range, the video scenes whose volume value is at or above the set value, a digest video of even higher quality can be created.
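The staged use of the two features described in [0015] might look like the following sketch. It assumes the telop stage has already produced time ranges as (start, end) sample indices and that the volume is sampled on the same index axis; the function name `loud_scenes_within` and its parameters are hypothetical.

```python
from typing import List, Tuple


def loud_scenes_within(
    telop_ranges: List[Tuple[int, int]], volume: List[float], min_volume: float
) -> List[Tuple[int, int]]:
    """For each telop-derived range (end exclusive), return the sub-ranges
    in which the volume is at or above min_volume."""
    scenes = []
    for start, end in telop_ranges:
        end = min(end, len(volume))  # guard against ranges past the data
        run_start = None
        for i in range(start, end):
            if volume[i] >= min_volume:
                if run_start is None:
                    run_start = i
            elif run_start is not None:
                scenes.append((run_start, i))
                run_start = None
        # Close a loud run that reaches the end of the telop range.
        if run_start is not None:
            scenes.append((run_start, end))
    return scenes
```

Loud passages that fall outside every telop-derived range are ignored, which is the point of applying the telop stage first.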

[0016] In addition, by having the digest creating means replace the extracted video material for the digest on the time axis, a digest video more useful to the user can be created.

[0017] When handling digitized video, the video information corresponding to the extracted features is acquired as still images and grouped according to each target feature condition, so that digest videos with a wider range of applications can be created.

[0018]

DESCRIPTION OF THE EMBODIMENTS

Embodiments of the present invention will be described below with reference to the drawings.

[0019] FIG. 1 is a block diagram of an information system including a digest creation system according to an embodiment of the present invention.

[0020] The TV program reservation system 20 records TV program video on a VTR tape, a magnetic tape medium. The digest creation system 10 performs digest creation processing on the video recorded on that VTR tape and outputs the result to the TV program reservation system 20 as a digest video.

[0021] The TV program reservation system 20 operates in cooperation with the digest creation system 10. It takes as input live TV video (terrestrial broadcasts, satellite broadcasts, and so on) and TV program information (a TV program guide), and records program video according to the TV programs set by the user.

[0022] Using the two systems 10 and 20 in combination makes an efficient TV program reservation system possible.

[0023] For example, when a user views a large amount of recorded video, spending as long watching as was spent recording is inefficient and a waste of time. The need to create digest videos from recorded video is therefore expected to grow steadily as the digital era brings more channels. In such an era, the point of creating a digest video is not only to shorten the program video to a length at which its content can still be understood; the need for digest videos suited to so-called personal information, which includes an individual's tastes and preferences, is also regarded as important.

[0024] FIG. 2 is a block diagram showing the configuration of the digest creation system 10.

[0025] In FIG. 2, the video capture unit 101 captures video information recorded (by the TV program reservation system 20 in FIG. 1) from ordinary TV broadcasts, satellite broadcasts, cable television, and the like, and stores the entire program video in the video storage unit 103 of the video information management unit 102. The video information management unit 102 performs overall control and management for creating digest videos from the video information stored in the video storage unit 103.

[0026] The telop extraction unit 104 is one functional element for creating digest videos. It extracts still images (frame images) taken from the program video at the program video points (elapsed time counted from the start of the program video) at which a telop is displayed, as well as moving images consisting of multiple consecutive frames.

[0027] Like the telop extraction unit 104, the sound information extraction unit 105 is one functional element for creating digest videos. It uses the sound information that exists in synchronization with the program video to extract characteristic portions that summarize the program video.

[0028] The combination processing unit 106 uses the telop extraction unit 104 and the sound information extraction unit 105 in a stepwise combination to extract digest video with new characteristic portions that have both the characteristic portions of the program video obtained by telop extraction and those obtained from the sound information.

[0029] The extraction information control unit (information switching unit) 107 switches among the telop extraction unit 104, the sound information extraction unit 105, and the combination processing unit 106 as appropriate and supplies the selected output to the digest creation unit 108 for digest video creation.

[0030] The digest creation unit 108 uses personal information, including the individual's tastes and preferences, obtained by the personal information processing unit 109 to create the final digest video based on the output of the extraction information control unit 107, and transfers it to the video information management unit 102 for storage.

[0031] The digest information output unit 110 outputs the digest video stored in the video storage unit 103 to another system (here, the TV program reservation system 20 in FIG. 1) for display.

[0032] The user interface unit 111 serves as the interface with the user. It transfers personal information entered by the user to the personal information processing unit 109 and transfers control information that follows the user's instructions to the video information management unit 102.

[0033] Next, the operation of the digest creation system 10 configured as described above will be described.

[0034] First, feature extraction (extraction of feature points or feature scenes) based on sound information (specifically, volume values) by the sound information extraction unit 105 in the digest creation system 10 will be described with reference to the operation diagram of FIG. 3 and the flowchart of FIG. 4.

[0035] Here, it is assumed that a sumo program recorded for two hours on a recording tape (VTR tape) TP by the TV program reservation system 20 in FIG. 1 has been captured, together with the sound information synchronized with it, by the video capture unit 101 and stored in the video storage unit 103. The time range R1 within the recording tape TP shown in FIG. 3 represents one bout.

[0036] The volume values within R1, traced and plotted as a graph, are shown in area R2. In this bout, the first frame image of the program video at each point where the volume reaches or exceeds (or, alternatively, exceeds) a predetermined volume value (audio power level) Th1 is extracted as images D1, D2, and D3, respectively.

[0037] As for how the bout went, if this is taken to show that there were three scenes that excited the audience, that is, three periods of sustained loud cheering, then the program video corresponding to the periods in which the volume is at or above the fixed value Th1 can also be extracted as continuous video. When sound information is used, extracting temporally continuous feature scenes can be the more effective form of feature extraction.

[0038] The sound information extraction unit 105 in this embodiment therefore checks the volume value of the sound information synchronized with the program video from the beginning of the recording tape TP (step S1). When it detects that the volume value has reached Th1 or more, it checks whether that state continues from that point for at least a predetermined time Th2, that is, whether the period during which the volume value is continuously at or above Th1 is at least Th2 (step S3). Only when the state in which the volume value is Th1 or more has continued for at least Th2 does the sound information extraction unit 105 perform the following video extraction.

[0039] First, the sound information extraction unit 105 determines, according to the user's designation information input to the video information management unit 102 via the user interface unit 111, whether to extract a still image or continuous images (step S4). If a still image is designated, the sound information extraction unit 105 extracts the first frame image of the program video at which the volume value became Th1 or more (as a feature point based on the volume value) (step S5). If continuous images are designated, it extracts the program video for the period in which the volume value is Th1 or more (as a feature scene based on the volume value) (step S6).
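The branch in steps S4 to S6 can be illustrated with a small selection function. This is a sketch under assumptions: the loud period is taken as already detected and given as a (start, end) frame index pair with the end exclusive, and `extract_feature` and the mode strings `"still"` and `"continuous"` are hypothetical names for the user's designation.

```python
from typing import List, Sequence, Tuple, TypeVar

Frame = TypeVar("Frame")


def extract_feature(
    frames: Sequence[Frame], interval: Tuple[int, int], mode: str
) -> List[Frame]:
    """Select frames for one loud period according to the user's designation."""
    start, end = interval
    if mode == "still":
        # Step S5: only the first frame at which the volume reached Th1.
        return [frames[start]]
    if mode == "continuous":
        # Step S6: every frame in the period where the volume is Th1 or more.
        return list(frames[start:end])
    raise ValueError(f"unknown mode: {mode!r}")
```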

[0040] The sound information extraction unit 105 repeats the above operation up to the end of the program video recorded on the recording tape TP (step S7). In the example of FIG. 3, the processing described for the time range R1 of one bout is performed over the entire two hours of the recording tape TP, thereby extracting the feature points or feature scenes based on volume values.

[0041] Monitor M1 on the operation screen in FIG. 3 displays the recording tape TP as is, and monitor M2 is used to display the extracted feature scenes (here, the digest video based on the volume features extracted by the sound information extraction unit 105).

[0042] FIG. 5 illustrates how the telop extraction unit 104 extracts the video scenes in which telops are displayed, again from two hours of the sumo program described with FIG. 3 recorded on the recording tape TP.

[0043] In FIG. 5, the time range R1 of the recording tape TP represents one bout. In the example of FIG. 5, three scenes within R1, namely images D11, D12, and D13, are extracted as telop video. Specific examples of these images (telop images) D11, D12, and D13 are shown in FIGS. 6(a), 6(b), and 6(c). Image D11 in FIG. 6(a) is the telop image of the opposing wrestlers' names displayed after the wrestlers are called, image D12 in FIG. 6(b) is the telop image of the table of past results between them, and image D13 in FIG. 6(c) is the telop image of the winning technique displayed when the winner of the bout is announced. Because the pattern of displayed telops is fixed for sumo program video, the time range R1 of a single bout on the recording tape TP can also be extracted.

[0044] Extraction of telop images (screens) by the telop extraction unit 104 is basically performed as follows. Over the entire video screen, the number of pixels for which the difference between temporally adjacent frame images is at or above a fixed value is counted, and images for which that count is at or above a fixed number are detected. Alternatively, the difference information between temporally adjacent frame images over the entire screen may be binarized with a fixed value, and images in which the frequency of edge data is at or above a fixed value may be detected.

[0045] In examples such as the sumo program above, the screen area in which telops are displayed is fixed, as in FIG. 6. In such cases, setting a mask area in advance and checking only that area can greatly improve extraction accuracy. The screen area in which telops are displayed is often determined by each program. It can therefore also be handled as one item of the user's taste and preference information; that is, a specific mask area can be set for each of a user's favorite programs.

[0046] Furthermore, by having the combination processing unit 106 combine feature scene extraction based on volume values by the sound information extraction unit 105 with feature video extraction based on telop extraction by the telop extraction unit 104, digest program video can be created even more efficiently. In the sumo example above, the time range of each bout is determined from the characteristic video information given by telop display, and the video scenes within that range whose volume value is at or above the set value Th1 are extracted, producing a higher-quality digest video.

[0047] The details of how the telop extraction unit 104 extracts telop video to create a digest video will now be described separately for two cases: extracting breaking-news telop video, as used for news and the like, and extracting the emphasis telop video frequently used in variety programs and the like.

[0048] FIG. 7(a) shows the mask areas set for extracting breaking-news telop video, as used for news and the like. Here, a horizontally long rectangular area 31 at the top of the screen or a horizontally long rectangular area 32 at the bottom of the screen is set as the detection area (mask area) on the television screen 30, and breaking-news telop video is extracted by checking changes over time in the image within area 31 or 32.

[0049] The extraction of breaking-news telop video by the telop extraction unit 104 will be described below with reference to the flowchart of FIG. 8.

[0050] First, the telop extraction unit 104 sequentially takes out temporally consecutive video frames from the beginning of the program video recorded on the recording tape TP, and obtains multi-valued difference information (a difference image) consisting of, for example, the differences in luminance data between pixels at the same positions in the mask area (31 or 32) of the two frame images (steps S11 to S13).

[0051] Next, for the difference information (difference image) of the mask area (31 or 32), the telop extraction unit 104 counts, keeping the image multi-valued, the number of pixels at or above a set threshold (step S14). When the count within the area is at or above a fixed number determined by the total number of pixels in the mask area (or, alternatively, a fixed proportion of that total), it determines that breaking-news telop video has been extracted (step S15).
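The masked difference test of [0050] and [0051] can be sketched as follows. This is an illustration under assumptions, not the patent's code: frames are taken to be 2-D lists of luminance values, the mask is a hypothetical (top, left, bottom, right) rectangle with exclusive bottom and right edges, and `pixel_threshold` and `min_ratio` stand in for the "set threshold" and the "fixed proportion".

```python
from typing import List, Tuple

Frame = List[List[int]]  # luminance values, indexed as frame[y][x]


def telop_change_detected(
    prev: Frame,
    curr: Frame,
    mask: Tuple[int, int, int, int],
    pixel_threshold: int,
    min_ratio: float,
) -> bool:
    """Return True when enough pixels inside the mask changed between frames."""
    top, left, bottom, right = mask
    changed = 0
    for y in range(top, bottom):
        for x in range(left, right):
            # Multi-valued difference: compare the magnitude with the threshold
            # directly, without binarizing the difference image first.
            if abs(curr[y][x] - prev[y][x]) >= pixel_threshold:
                changed += 1
    total = (bottom - top) * (right - left)
    return changed >= min_ratio * total
```

Restricting the loops to the mask rectangle is what keeps motion elsewhere on the screen from triggering a detection.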

[0052] With this determination alone, only the moment the telop video appears and the moment the telop display ends would be extracted. This embodiment therefore applies the following technique.

[0053] When the telop extraction unit 104 detects the moment telop video appears, it holds the immediately preceding frame image, that is, the frame image just before the telop is displayed (steps S16, S17), turns ON a flag indicating that the start of telop display has been detected (the telop video display start flag) (step S18), and extracts the current frame image taken out in step S12 (here, the first telop frame) as telop video (step S19).

[0054] Thereafter, the telop extraction unit 104 takes out the following frame image (step S12) and obtains its difference information relative to the frame image held in step S17, the one just before the telop was displayed (step S13). Because the difference is taken against the frame just before the telop appeared, for as long as the telop remains displayed the number of pixels in the difference information (difference image) at or above the set threshold stays at a large value at or above the fixed number (steps S14, S15). Accordingly, in this state, when the telop video display start flag is ON, the current frame image is extracted as a telop image (steps S16, S19), so that all frame images in the period during which the telop is displayed can be extracted as telop video.

【００５５】やがて設定閾値以上の画素の数が一定個数
に満たなくなると、テロップ抽出部１０４はテロップ映
像表示開始フラグをＯＦＦし（ステップＳ２１）、現フ
レーム画像を保持した上で（ステップＳ２２）、次のフ
レーム画像を取り出して( ステップＳ１２）、前記した
ステップＳ１３以降の処理を繰り返す。このように設定
閾値以上の画素の数が一定個数に満たなくなった時点で
は、通常の表示に戻るため、差分情報における設定閾値
以上の画素の数は少なくなる。When the number of pixels at or above the set threshold eventually falls below the fixed count, the telop extraction unit 104 turns OFF the telop video display start flag (step S21), holds the current frame image (step S22), takes out the next frame image (step S12), and repeats the processing from step S13 onward. The count falls below the fixed number at this point because the screen has returned to the normal display, so few pixels in the difference information reach the set threshold.
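The bulletin telop procedure of paragraphs [0053] to [0055] can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the representation of a frame as a 2-D list of luminance values already cropped to the mask area, and the two threshold defaults are all assumptions made for illustration.

```python
def count_changed_pixels(frame, reference, pixel_threshold):
    """Count pixels whose absolute luminance difference reaches the threshold."""
    return sum(
        1
        for row, ref_row in zip(frame, reference)
        for value, ref_value in zip(row, ref_row)
        if abs(value - ref_value) >= pixel_threshold
    )


def extract_telop_frames(frames, pixel_threshold=50, count_threshold=4):
    """Return the indices of frames judged to show a telop in the mask area.

    frames: list of 2-D lists of luminance values (hypothetical input format).
    """
    if not frames:
        return []
    telop_indices = []
    telop_flag = False        # plays the role of the telop display start flag
    reference = frames[0]     # frame currently held for comparison
    for index in range(1, len(frames)):
        current = frames[index]
        changed = count_changed_pixels(current, reference, pixel_threshold)
        if changed >= count_threshold:
            if not telop_flag:
                # Telop has just appeared. The held reference is the frame
                # one before the telop, and it is deliberately not updated.
                telop_flag = True
            telop_indices.append(index)
        else:
            # Normal display: clear the flag and hold the current frame.
            telop_flag = False
            reference = current
    return telop_indices
```

Because the reference frame is frozen while the flag is ON, every frame of a static telop keeps differing from it, which is why the sketch reports the whole display period rather than only its first and last frames.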

【００５６】図７（ｂ）は、バラエティ番組などで多用
されている強調テロップ映像を抽出するために設定され
るマスク領域を示す。ここでは、テレビ画面３０上での
検出領域（マスク領域）として、ほぼ画面全体の矩形領
域３３を設定し、当該領域３３の画像の時間的変化をチ
ェックすることで、強調テロップ映像を抽出する。強調
テロップ映像の特徴としては、画面全体に大きな文字
で、且つ目立つ色でテロップ映像が流される点にある。FIG. 7B shows a mask area set to extract an emphasized telop image frequently used in variety programs and the like. Here, a rectangular area 33 of almost the entire screen is set as a detection area (mask area) on the television screen 30, and a temporal change of an image of the area 33 is checked to extract an emphasized telop video. The feature of the emphasized telop image is that the telop image is played in large characters and in a conspicuous color over the entire screen.

【００５７】したがって本実施形態では、図９のフロー
チャートに示すように、先の速報用テロップ映像の抽出
と同様にして、領域３３内の画像の時間軸上での差分処
理（ステップＳ２１，Ｓ２２〜Ｓ２４）を行う他に、以
下の処理を行う。Therefore, in the present embodiment, as shown in the flowchart of FIG. 9, in addition to performing difference processing on the time axis for the image in the area 33 in the same way as in the bulletin telop video extraction described earlier (steps S21, S22 to S24), the following processing is performed.

【００５８】まず、マスク領域（３３）の差情報（差分
画像）における設定閾値以上の画素数が一定個数以上と
なった時点で、直前のフレーム画像、即ちテロップを表
示する１つ前のフレーム画像を保持し、テロップ映像表
示開始フラグをＯＮする点（ステップＳ２６〜Ｓ２８）
は、先の速報用テロップ映像の抽出の場合と同様であ
る。速報用テロップ映像の抽出と異なるのは、差情報
（差分画像）における設定閾値以上の画素の数が連続し
て一定個数以上の大きな値となっている期間、その都
度、その時点における現フレーム画像を強調テロップ映
像候補として抽出して、そのフレーム画像における色の
頻度情報を求めておく点（ステップＳ２９）である。こ
の場合、色情報として色相値を使用することで、色の抽
出が安定して行える。First, as in the bulletin telop video extraction described earlier, when the number of pixels at or above the set threshold in the difference information (difference image) of the mask area (33) reaches the fixed count or more, the immediately preceding frame image, that is, the frame image one frame before the telop is displayed, is held and the telop video display start flag is turned ON (steps S26 to S28). What differs from the bulletin telop video extraction is that, throughout the period in which the number of pixels at or above the set threshold in the difference information (difference image) stays at or above the fixed count, the current frame image at each point is extracted as an emphasized telop video candidate and the color frequency information of that frame image is obtained (step S29). Using hue values as the color information allows colors to be extracted stably.

【００５９】さてテロップ抽出部１０４は、設定閾値以
上の画素の数が一定個数に満たなくなり、通常の表示に
戻ると、テロップ映像表示開始フラグをＯＦＦし、現フ
レーム画像を保持した上で（ステップＳ３０，Ｓ３
１）、それまで抽出しておいた各強調テロップ映像候補
の色頻度を比較して、マスク領域（チェック領域）３３
内で一定の割合となる色を探す（ステップＳ３２）。そ
してテロップ抽出部１０４は、領域３３内で一定の割合
となる色が存在する場合に、該当する映像を強調テロッ
プ映像と判断して記憶する（ステップＳ３４）。一方、
領域３３内で一定の割合となる色が存在しない場合に
は、テロップ抽出部１０４は該当する映像は強調テロッ
プ映像ではないとして、廃棄する。When the number of pixels at or above the set threshold falls below the fixed count and the display returns to normal, the telop extraction unit 104 turns OFF the telop video display start flag, holds the current frame image (steps S30 and S31), and then compares the color frequencies of the emphasized telop video candidates extracted up to that point to look for a color that occupies a fixed proportion of the mask area (check area) 33 (step S32). If such a color exists in the area 33, the telop extraction unit 104 judges the corresponding video to be an emphasized telop video and stores it (step S34). If no such color exists in the area 33, the telop extraction unit 104 judges that the corresponding video is not an emphasized telop video and discards it.
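The color check applied to emphasized telop candidates can be sketched as follows. This is only one possible reading of "a color that occupies a fixed proportion": the sketch assumes that a hue bin must cover at least `min_ratio` of the mask area in every candidate frame. The function names, the bin count, the saturation cut-off used to ignore near-gray pixels, and the frame format (a flat list of RGB tuples) are hypothetical.

```python
import colorsys
from collections import Counter


def hue_histogram(frame, bins=12, min_saturation=0.3):
    """Count pixels per hue bin; frame is a list of (r, g, b) tuples in 0..255.

    Pixels below min_saturation are skipped because their hue is unreliable
    (an assumption of this sketch, not a step stated in the source).
    """
    counts = Counter()
    for r, g, b in frame:
        h, s, _ = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        if s < min_saturation:
            continue
        counts[min(int(h * bins), bins - 1)] += 1
    return counts


def is_emphasized_telop(candidates, min_ratio=0.2, bins=12):
    """True if some hue bin covers at least min_ratio of every candidate frame."""
    if not candidates:
        return False
    common = None
    for frame in candidates:
        hist = hue_histogram(frame, bins)
        strong = {b for b, n in hist.items() if n / len(frame) >= min_ratio}
        common = strong if common is None else common & strong
    return bool(common)
```

Hue is used rather than raw RGB so that brightness changes in the background have less influence on which bin a telop color falls into, in line with the remark above that hue values make color extraction stable.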

【００６０】ここで、ダイジェスト映像としては、強調
テロップ映像のみの映像では、本来の番組映像の内容が
分かりにくい可能性がある。このため、強調テロップ映
像が表示された後、予め定められた時間の映像を抽出す
ることにより、強調テロップで表現した内容と映像を対
応させることが可能となる。A digest video consisting only of emphasized telop video may make the content of the original program video hard to follow. For this reason, video of a predetermined duration following the display of the emphasized telop video is also extracted, so that the content expressed by the emphasized telop can be matched with the accompanying video.
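As a small illustration of extending an emphasized telop with the video that follows it, the sketch below computes the clip boundaries in seconds. The function name and parameters are hypothetical, and keeping the telop itself inside the clip and clamping to the program length are assumptions of this sketch.

```python
def clip_after_telop(telop_start, telop_end, follow_seconds, program_length):
    """Return (start, end) covering the telop plus the video that follows it."""
    end = min(telop_end + follow_seconds, program_length)
    return (telop_start, end)
```

For a telop shown from 120 s to 125 s with a 30 s follow-on, the clip would run from 120 s to 155 s.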

【００６１】次に、ダイジェスト作成用に抽出した映像
素材をユーザインタフェース部１１１を通して与えられ
るユーザの個人情報を使って、ダイジェスト作成部１０
８にて時間軸上で差し替える方法について、図１０を参
照して説明する。Next, with reference to FIG. 10, a method will be described in which the digest creation unit 108 rearranges, on the time axis, the video material extracted for digest creation by using the user's personal information supplied through the user interface unit 111.

【００６２】まずユーザのＴＶ番組における例えば図１
０の料理番組映像４０の視聴スタイルとして、当該ユー
ザが実際に料理を作る必要がある場合には、料理名やレ
シピ情報の映像を最初に見たいことが多く、必要に応じ
て料理過程を参照する。First, consider how a user watches a TV program such as the cooking program video 40 of FIG. 10. A user who actually needs to prepare the dish often wants to see the video of the dish name and recipe information first, and refers to the cooking process only as needed.

【００６３】ところが、既に述べたような手法で通常に
作成されるダイジェスト映像では、特徴シーンの順番
は、図１０のように、「料理番組タイトル」のテロップ
画像Ｄ２１、「レシピ」のテロップ画像Ｄ２２、そして
「出来上がり」のテロップ画像Ｄ２３の順の並びとな
り、ユーザとしては料理番組映像４０の時間（ここでは
１０分）全部を見る必要がある。However, in a digest video created in the ordinary way by the method described above, the characteristic scenes appear in the order shown in FIG. 10: the telop image D21 of the "cooking program title", the telop image D22 of the "recipe", and then the telop image D23 of the "finished" dish. The user therefore has to watch the entire duration of the cooking program video 40 (10 minutes here).

【００６４】これに対して、ダイジェスト作成部１０８
により、ユーザの個人情報（好み）を考慮して、矢印４
１のように、抽出した特徴シーンの時間的順番を差し替
えて、例えば、料理名を表示した「出来上がり」テロッ
プの画像Ｄ２３、「レシピ」情報を表示したテロップ画
像Ｄ２２の順に並べ、その後に実際の料理過程の映像シ
ーンを並べる。この料理過程の映像シーンは、「レシ
ピ」のテロップ画像Ｄ２２と「出来上がり」のテロップ
画像２３とで挟まれた時間Ｌの区間のものをそのまま抽
出しても、時間的にサンプリングしたものであっても構
わない。In contrast, the digest creation unit 108 takes the user's personal information (preferences) into account and rearranges the temporal order of the extracted characteristic scenes as indicated by the arrow 41. For example, it places the image D23 of the "finished" telop, which displays the dish name, first, followed by the telop image D22 displaying the "recipe" information, and then the video scenes of the actual cooking process. These cooking-process scenes may be the whole of the section of time L between the "recipe" telop image D22 and the "finished" telop image D23, extracted as is, or a temporal sampling of that section.
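The preference-driven rearrangement of characteristic scenes can be sketched as a sort keyed on a user-supplied label order. The `Scene` class, the scene labels, and the rule that unlisted scenes fall to the end in chronological order are hypothetical choices made for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Scene:
    label: str     # e.g. "title", "recipe", "finished" (hypothetical labels)
    start: float   # seconds from the start of the program
    end: float


def reorder_scenes(scenes, preferred_order):
    """Sort scenes by the user's preferred label order.

    Scenes whose label is not listed are placed last, in chronological order.
    """
    rank = {label: i for i, label in enumerate(preferred_order)}
    return sorted(scenes, key=lambda s: (rank.get(s.label, len(rank)), s.start))


scenes = [
    Scene("title", 0, 5),
    Scene("recipe", 60, 70),
    Scene("cooking", 70, 540),
    Scene("finished", 540, 550),
]
digest = reorder_scenes(scenes, ["finished", "recipe", "cooking"])
```

With the order "finished", "recipe", "cooking", the digest opens with the finished dish and the recipe, followed by the cooking process, which matches the rearrangement described for the cooking program.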

【００６５】以上は、ＴＶ番組予約システム２０と組み
合わせて使用されるダイジェスト作成システム１０につ
いて説明したが、これに限るものではない。Although the digest creation system 10 used in combination with the TV program reservation system 20 has been described above, the present invention is not limited to this.

【００６６】例えば、図１１に示すように、ダイジェス
ト作成システム１０が、ＴＶ番組映像などのコンテンツ
を提供するシステム（コンテンツプロバイダ側システ
ム）５０に組み込まれて使用されるものであっても構わ
ない。つまり、コンテンツプロバイダ側システム５０が
ユーザ側システム６０から要求された番組映像のダイジ
ェスト映像を作成するのにダイジェスト作成システム１
０を用いることも可能である。For example, as shown in FIG. 11, the digest creation system 10 may be incorporated and used in a system that provides content such as TV program video (content provider system) 50. That is, the content provider system 50 can use the digest creation system 10 to create a digest video of the program video requested by the user system 60.

【００６７】図１１の構成において、ユーザ側システム
６０は、ユーザ自身が見たい番組映像をコンテンツプロ
バイダ側システム５０に要求する。この際、要求した番
組映像の提供をコンテンツプロバイダ側システム５０か
ら受ける場合に、当該システム５０が最初から番組映像
の全てをユーザ側システム６０に送るように構成されて
いるものとすると、容量の多い映像を転送するのに時間
がかかると共に、ユーザが本当に見たい番組映像であっ
たのかといった問題がある。In the configuration of FIG. 11, the user system 60 requests from the content provider system 50 the program video that the user wants to watch. If the system 50 were configured to send the whole of the requested program video to the user system 60 from the outset, transferring the large volume of video would take time, and there is also the question of whether it was really the program video the user wanted to watch.

【００６８】これに対して、図１１のビデオ・オン・デ
マンドシステムにおいては、コンテンツプロバイダ側シ
ステム５０に組み込まれたダイジェスト作成システム１
０により、ユーザの要求に応じてユーザの指定した番組
映像のダイジェスト映像を作成して、ユーザ側システム
６０に提供することができる。これによりユーザは、ま
ず最初にダイジェスト映像で目的とする番組映像の大ま
かな内容を把握した上で、もし本当に見たい番組映像で
あるならば、再度、番組映像の原映像の転送を要求すれ
ばよい。In contrast, in the video-on-demand system of FIG. 11, the digest creation system 10 incorporated in the content provider system 50 can create, in response to a user request, a digest video of the program video specified by the user and provide it to the user system 60. The user can thus first grasp the rough content of the target program video from the digest video and, if it really is a program video he or she wants to watch, then request transfer of the original program video.
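The digest-first exchange between the user system and the content provider system can be modeled with a very small in-memory sketch. The class and method names are hypothetical, a video is stood in for by a list of frame numbers, and the digest function simply keeps every tenth frame, which is a placeholder for the telop-based and sound-based extraction described in this document.

```python
class ContentProvider:
    """Toy stand-in for the content provider system 50 (names are hypothetical)."""

    def __init__(self, library, make_digest):
        self.library = library          # program id -> full video (list of frames)
        self.make_digest = make_digest  # callable standing in for system 10

    def request_digest(self, program_id):
        """First request: return only the digest of the chosen program."""
        return self.make_digest(self.library[program_id])

    def request_original(self, program_id):
        """Second request, issued only if the user wants the full program."""
        return self.library[program_id]


provider = ContentProvider(
    library={"cooking": list(range(100))},
    make_digest=lambda video: video[::10],   # placeholder digest: every 10th frame
)
digest = provider.request_digest("cooking")
original = provider.request_original("cooking")
```

The point of the two-step exchange is that the large transfer happens only after the user has confirmed interest from the much smaller digest.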

【００６９】ところで、ダイジェスト映像を番組映像の
内容を示す１つの要約と考えた場合、今後、普及が予想
される映像のデジタル化に向けて、時間軸上でクラスタ
化した映像番組の要約の表現として、同様な時間軸上で
連続した情報である必要はない。そこで、例えばＤＶＤ
用の映画コンテンツをその特徴情報に対応した映像中の
静止画像として抽出し、予め用意している映画紹介シナ
リオ・マップに貼り付けることによって映画の内容を示
す要約として利用することも可能である。勿論、その要
約に興味があれば、原映像を鑑賞することも可能であ
る。If a digest video is regarded as a summary of the content of a program video, then, looking ahead to the expected spread of digitized video, a summary of a video program clustered on the time axis need not itself be information that is continuous on the same time axis. For example, still images corresponding to the feature information could be extracted from movie content for DVD and pasted into a movie introduction scenario map prepared in advance, to serve as a summary of the movie's content. Of course, a user whose interest is caught by the summary can go on to watch the original video.

【００７０】また、前記実施形態で適用したように、テ
ロップ情報及び音情報を用いて、番組映像の特徴とし
た、時間軸上の各ポイント映像を抽出して、時間軸上で
連続したダイジェスト番組を作成する他に、当該時間軸
上の各ポイント映像を本来の連続した時間軸上に編集せ
ずに、時間軸上で任意に並べ替えて編集することによっ
て、ダイジェスト番組を鑑賞する場合においても直ちに
結果映像を見ることを可能とする。このような並べ替え
による編集の効果的な例として、野球やサッカーなどの
スポーツ番組映像への適用がある。ここでは、試合結果
を最初に見て、後でゆっくりと内容を見るといった場面
を考えることができる。In the embodiment described above, telop information and sound information are used to extract point videos on the time axis that characterize the program video, and a digest program that is continuous on the time axis is created from them. Alternatively, these point videos need not be edited onto the original continuous time axis; arranging them in an arbitrary order on the time axis lets the viewer see the result video immediately even when watching a digest program. Sports program video such as baseball or soccer is an effective application of such rearranged editing: the viewer can see the result of the game first and go through the content at leisure afterwards.

【００７１】以上に述べたダイジェスト作成システム１
０を構成する機能要素群、あるいはダイジェスト作成シ
ステム１０にて実行される処理手順、特にテロップ抽出
部１０４及び音情報抽出部１０５にて実行される処理手
順は、コンピュータをその機能要素群の集合として機能
させるためのプログラム、あるいはコンピュータに当該
処理手順を実行させるためのプログラムが記録されたＣ
Ｄ−ＲＯＭ等の記録媒体をコンピュータに装着して、当
該プログラムを読み取り実行させることにより実現され
る。このプログラムが、通信回線等の通信媒体を通して
コンピュータにロードされるものであってもよい。The group of functional elements constituting the digest creation system 10 described above, or the processing procedures executed in the digest creation system 10, in particular those executed in the telop extraction unit 104 and the sound information extraction unit 105, can be realized by loading into a computer a recording medium such as a CD-ROM on which is recorded a program for causing the computer to function as that group of functional elements, or a program for causing the computer to execute those processing procedures, and having the computer read and execute the program. The program may also be loaded into the computer through a communication medium such as a communication line.

【００７２】[0072]

【発明の効果】以上詳述したように本発明によれば、映
像情報、特に録画した番組映像に対して、通常ＶＴＲ装
置の早速りのような単純な時間軸上の情報圧縮、映像認
識など処理に時間を要する方法によらず、番組映像の属
性を表現しているテロップや音情報を使って、当該番組
映像の特徴となる場面やシーンを抽出してダイジェスト
映像としているため、当該ダイジェスト映像をフレキシ
ブルに編集できる。As described in detail above, according to the present invention, characteristic scenes of video information, in particular of a recorded program video, are extracted to form a digest video by using the telops and sound information that express the attributes of the program video, rather than by simple information compression on the time axis such as fast-forwarding on an ordinary VTR, or by time-consuming methods such as video recognition. The digest video can therefore be edited flexibly.

[Brief description of the drawings]

【図１】本発明の一実施形態に係るダイジェスト作成シ
ステムをＴＶ番組予約システムと組み合わせて構成され
た情報システムのブロック図。FIG. 1 is a block diagram of an information system configured by combining a digest creation system according to an embodiment of the present invention with a TV program reservation system.

【図２】図１中のダイジェスト作成システム１０の構成
を示すブロック図。FIG. 2 is a block diagram showing the configuration of the digest creation system 10 in FIG. 1.

【図３】音情報抽出部１０５による音情報（音量値）に
基づく特徴抽出を説明するための図。FIG. 3 is a view for explaining feature extraction based on sound information (volume value) by a sound information extraction unit 105;

【図４】音情報抽出部１０５による音情報（音量値）に
基づく特徴抽出の処理手順を示すフローチャート。FIG. 4 is a flowchart showing a processing procedure of feature extraction based on sound information (volume value) by a sound information extraction unit 105;

【図５】テロップ抽出部１０４による番組映像内のテロ
ップ映像抽出を説明するための図。FIG. 5 is a view for explaining extraction of a telop video from a program video by a telop extraction unit 104;

【図６】大相撲の番組映像から抽出されたテロップ映像
の一例を示す図。FIG. 6 is a diagram showing an example of a telop video extracted from a sumo wrestling program video.

【図７】テロップ抽出部１０４によるテロップ映像抽出
に用いられるマスク領域を、速報用テロップ映像抽出と
強調テロップ映像抽出の各々について示す図。FIG. 7 is a diagram showing the mask areas used for telop video extraction by the telop extraction unit 104, for bulletin telop video extraction and emphasized telop video extraction respectively.

【図８】テロップ抽出部１０４による速報用テロップ映
像抽出の処理手順を示す図。FIG. 8 is a diagram showing a processing procedure for extracting a telop video for a flash report by a telop extraction unit 104;

【図９】テロップ抽出部１０４による強調テロップ映像
抽出の処理手順を示す図。FIG. 9 is a diagram showing a processing procedure for extracting an enhanced telop video by a telop extraction unit 104;

【図１０】ダイジェスト作成用に抽出した映像素材をダ
イジェスト作成部１０８により時間軸上で差し替える動
作を説明するための図。FIG. 10 is a view for explaining an operation in which the digest creation unit 108 rearranges, on the time axis, the video material extracted for digest creation.

【図１１】本発明の一実施形態に係るダイジェスト作成
システムをビデオ・オン・デマンドシステムに適用した
構成例を示すブロック図。FIG. 11 is a block diagram showing a configuration example in which a digest creation system according to an embodiment of the present invention is applied to a video-on-demand system.

[Explanation of symbols]

１０…ダイジェスト作成システム２０…ＴＶ番組予約システム５０…コンテンツプロバイダ側システム１０２…映像情報管理部１０４…テロップ抽出部１０５…音情報抽出部１０６…組み合わせ処理部１０７…抽出情報制御部１０８…ダイジェスト作成部１０９…個人情報処理部
DESCRIPTION OF SYMBOLS
10 ... Digest creation system
20 ... TV program reservation system
50 ... Content provider system
102 ... Video information management unit
104 ... Telop extraction unit
105 ... Sound information extraction unit
106 ... Combination processing unit
107 ... Extraction information control unit
108 ... Digest creation unit
109 ... Personal information processing unit

フロントページの続き (72)発明者阿部哲也神奈川県川崎市幸区柳町70番地株式会社東芝柳町工場内 (72)発明者大喜多秀紀神奈川県川崎市幸区柳町70番地株式会社東芝柳町工場内 (72)発明者紺田和宣神奈川県川崎市幸区柳町70番地株式会社東芝柳町工場内 Continued from the front page (72) Inventor Tetsuya Abe, Toshiba Corporation Yanagicho Plant, 70 Yanagicho, Saiwai-ku, Kawasaki-shi, Kanagawa (72) Inventor Hidenori Okita, Toshiba Corporation Yanagicho Plant, 70 Yanagicho, Saiwai-ku, Kawasaki-shi, Kanagawa (72) Inventor Kazunobu Konta, Toshiba Corporation Yanagicho Plant, 70 Yanagicho, Saiwai-ku, Kawasaki-shi, Kanagawa

Claims

[Claims]

1. A digest creation system comprising: sound information extracting means for extracting a characteristic image or a characteristic scene of video information based on sound information added in synchronization with the video information; and digest creation means for creating a digest video based on the characteristic image or characteristic scene extracted by the sound information extracting means.

2. A digest creation system comprising: telop extracting means for extracting a telop video including character information from video information; and digest creation means for creating a digest video based on the telop video extracted by the telop extracting means.

3. A digest creation system comprising: sound information extracting means for extracting a characteristic image or a characteristic scene of video information based on sound information added in synchronization with the video information; telop extracting means for extracting a telop video including character information from the video information; combination processing means for extracting a new characteristic video by combining the characteristic image or characteristic scene extracted by the sound information extracting means with the telop video extracted by the telop extracting means; and digest creation means for creating a digest video based on the characteristic video extracted by the combination processing means.

4. The digest creation system according to claim 1 or 3, wherein the sound information extracting means extracts a characteristic image of the video information at the corresponding time point, or a characteristic scene of the video information in the corresponding time zone, by detecting a state in which the time during which the volume value of the sound information added in synchronization with the video information continuously exceeds a predetermined set value is equal to or longer than a predetermined set time.

5. The digest creation system according to claim 2 or 3, wherein the telop extracting means extracts a telop video by detecting a change on the time axis of image data in a region on the video screen that is specific to the type of the video or to the type of the telop video.

6. The digest creation system according to claim 5, wherein the video information subjected to telop video extraction by the telop extracting means is a program video, and the telop extracting means comprises: bulletin telop extracting means for extracting a bulletin telop video by checking changes on the time axis of luminance information in predetermined areas in an upper part and a lower part of the program video screen; and enhanced telop extracting means for extracting an enhanced telop video by checking the change state on the time axis of color information for the entire program video screen and detecting a state in which a color separated from the set color, extracted in the past on the time axis, is equal to or greater than a set ratio.