JP3503797B2

JP3503797B2 - Video telop detection method and apparatus

Info

Publication number: JP3503797B2
Application number: JP11271997A
Authority: JP
Inventors: 隆佐藤; 佳伸外村
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1997-04-30
Filing date: 1997-04-30
Publication date: 2004-03-08
Anticipated expiration: 2017-04-30
Also published as: JPH10304247A

Description

Detailed Description of the Invention

[0001]

[Technical field of the invention] The present invention relates to a method and apparatus for detecting telops (superimposed captions) in video.

[0002]

[Prior art] Conventional devices for detecting telops in video rely on local features taken from one to several frame images.

[0003] For example, one method exploits the large luminance difference around a telop: it first examines changes in the distribution of luminance and hue between frames to find the frame in which a telop appears, and then takes the difference between the frames before and after the telop appears to extract the telop (for example, Nemoto et al., "Searching for archive video by telop recognition," 1994 IEICE Spring Conference, D-427, 1994).

[0004] Another method works on a single frame image. Using the property that a telop is brighter than the background and its edges are easy to extract, it extracts edges from the image by first-order differentiation and projects the edge image in the vertical and horizontal directions to detect the telop (for example, Mogi et al., "Indexing articles based on character recognition in news video," IEICE Technical Report IE95-153, 1996).

[0005] There is also a device that uses the property that a telop is stationary and has high luminance: it finds the portions with no motion between two frames and then detects, as the caption portion, the regions whose luminance is at or above a predetermined value (for example, Japanese Unexamined Patent Application Publication No. H8-331456, "Subtitle moving device").

[0006] In video encoded using inter-frame correlation, such as MPEG, pixels that are encoded using inter-frame correlation but without motion compensation tend to concentrate, in both time and space, in telop portions. There is also a device that exploits this property and detects telops by counting, over a given time interval, how often pixels encoded using inter-frame correlation and without motion compensation appear (Sato et al., "Method of extracting telop regions from MPEG video," 1996 IEICE Information and Systems Society Conference, D-273).

[0007]

[Problems to be solved by the invention] The prior art described above detects telops from temporally local information, namely one or two frame images. Consequently, when a subject other than a telop shares telop-like features, such as being stationary, having high luminance, or having large high-frequency components, it is erroneously detected as a telop.

[0008] Conversely, when a telop that stays on screen for a long time temporarily moves or its outline blurs because of image-quality degradation, noise, or the like, that portion is missed. As a result, what is really one continuous telop is detected redundantly as separate telops spanning several time intervals.

[0009] In short, because the prior art decides whether a telop is present by looking at a short interval, it can hardly avoid over-detecting subjects other than telops or missing telops because of noise. When the prior art is used to obtain a list of the telops in a video, it therefore often displays subjects other than telops by mistake or displays the same telop more than once.

[0010] An object of the present invention is to provide a video telop detection method and apparatus that detect telops in video without excess or omission.

[0011]

[Means for solving the problems] To achieve the above object, the video telop detection method has an extraction step of extracting telop candidate pixels in units of pixels or sets of pixels and storing them in a three-dimensional buffer consisting of vertical and horizontal spatial axes and a time axis, and a merging step of merging the telop candidate pixels in the buffer.

[0012] The video telop detection apparatus of the present invention has a three-dimensional buffer consisting of vertical and horizontal spatial axes and a time axis, extraction means for extracting telop candidate pixels in units of pixels or sets of pixels and storing them in the buffer, and merging means for merging the telop candidate pixels in the buffer.

[0013] Extracting telop candidate pixels from the video in units of pixels or sets of pixels and storing them in a three-dimensional buffer consisting of vertical and horizontal spatial axes and a time axis makes it possible to process video spanning a longer time than the prior art can. Merging the telop candidate pixels in the buffer then makes it possible to ignore small short-lived changes and remove the influence of noise.

[0014] According to an embodiment of the present invention, the extraction means has edge generation means for obtaining the edges of the video, projection means for projecting the edge values in the vertical and horizontal directions, and comparison means for determining telop candidate pixels based on the result of comparing the projected values with a threshold.

[0015] In the extraction means, the edge generation means obtains the edges of the video. Because telops have large high-frequency components, this yields an image in which edges are concentrated around the telop. The projection means, which projects the edge values in the vertical and horizontal directions, then makes it possible to evaluate the degree of edge concentration in one dimension. The comparison means compares the projected values with a threshold and thereby detects the portions where edges are concentrated. Telop candidate pixels can be obtained in this way.

[0016] According to an embodiment of the present invention, the extraction means has counting means for counting, for each pixel position within a counting interval, the number of pixels that were encoded using inter-frame correlation and without motion compensation, from video data encoded using inter-frame correlation.

[0017] From video data encoded using inter-frame correlation, the counting means counts, for each pixel position over a given counting interval, the number of pixels that were encoded using inter-frame correlation and without motion compensation. Because such pixels tend to concentrate in telops, large counts are obtained only for telop pixels. This yields telop candidate pixels whose values are larger the higher the likelihood of a telop.

[0018] According to an embodiment of the present invention, the merging means has smoothing means for smoothing the telop candidate pixels with a three-dimensional smoothing filter.

[0019] In the merging means, the smoothing means smooths the telop candidate pixels with a three-dimensional smoothing filter, so that neighboring telop candidate pixels are merged with one another and small isolated telop candidate pixels vanish.

[0020] According to an embodiment of the present invention, the merging means has expansion means for replacing a telop candidate pixel with the maximum value of its neighboring pixels and contraction means for replacing a telop candidate pixel with the minimum value of its neighboring pixels.

[0021] In the merging means, the expansion means replaces a telop candidate pixel with the maximum value of its neighboring pixels, so that neighboring telop candidate pixels are merged with one another. The contraction means replaces a telop candidate pixel with the minimum value of its neighboring pixels, so that small isolated pixels vanish. Combining the two means merges neighboring telop candidate pixels and eliminates small isolated telop candidate pixels.

[0022] According to an embodiment of the present invention, the video telop detection apparatus further has determination means for taking the frame before or after a time interval containing no telop candidate pixels as a representative frame containing a telop.

[0023] Because the determination means takes the frame before or after a time interval containing no telop candidate pixels as a representative frame representing a telop, representative frames representing telops can be obtained.

[0024] According to an embodiment of the present invention, the video telop detection apparatus further has labeling means for assigning labels to the connected components of the merged telop candidate pixels, and determination means for taking a frame containing labeled telop candidate pixels as a frame containing a telop.

[0025] Because the labeling means assigns labels to the connected components of the merged telop candidate pixels, individual telops can be distinguished. Because the determination means takes a frame containing labeled telop candidate pixels as the representative frame representing a telop, a representative frame can be obtained for each telop without excess or omission.

[0026]

[Embodiments of the invention] Embodiments of the present invention are described next with reference to the drawings.

[0027] FIG. 1 is a block diagram of a video telop detection apparatus according to a first embodiment of the present invention.

[0028] The video telop detection apparatus of this embodiment consists of buffer 103; telop candidate pixel extraction unit 102, which detects pixels or sets of pixels that are candidates for a telop region in the video input from input terminal 101 and accumulates them in buffer 103; and merging unit 104, which merges the telop candidates accumulated in buffer 103 and outputs them to output terminal 105.

[0029] A block of 8 × 8 to 16 × 16 pixels can be used as the set of pixels. Buffer 103 is three-dimensional and, as shown in FIG. 2, is described by two axes x and y parallel to the screen and an axis t perpendicular to it. For example, when 16 × 16 blocks are used on a 720 × 480 pixel screen, the width W of buffer 103 is 45 (720 / 16) and the height H is 30 (480 / 16). The depth T of buffer 103 is the duration of the target video divided by the temporal resolution. For example, when a 10-minute video is processed at 0.5-second intervals, T is 1200 (600 s / 0.5 s).
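The buffer dimensions in the example above can be checked with a few lines of Python. This is only an illustrative sketch: the function name `buffer_shape` and its parameters are hypothetical, and it assumes the frame size is an exact multiple of the block size.

```python
def buffer_shape(frame_width, frame_height, block_size, duration_s, step_s):
    """Return (W, H, T) of the 3-D buffer for a given block size and time step."""
    width = frame_width // block_size      # W: blocks per row
    height = frame_height // block_size    # H: blocks per column
    depth = round(duration_s / step_s)     # T: number of time samples
    return width, height, depth


# 720 x 480 screen, 16 x 16 blocks, 10 minutes sampled every 0.5 s
print(buffer_shape(720, 480, 16, 10 * 60, 0.5))  # (45, 30, 1200)
```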

[0030] FIG. 3 is a block diagram of telop candidate pixel extraction unit 102. Edge generation unit 202 obtains an edge image from the video input to input terminal 201 and stores it in buffer 203. An image-processing operator such as the Laplacian or the Roberts operator can be used to obtain the edges. Next, vertical projection unit 204 projects the image in the vertical direction and takes the frequency. As v in FIG. 4 shows, the frequency is high where edges are concentrated, so comparison unit 205 compares it with the threshold input to input terminal 206 and obtains the range (x0 to x1) at or above the threshold. Restricted to the range (x0 to x1), horizontal projection unit 207 then projects the edge image in the horizontal direction and obtains the frequency. Comparison unit 208 compares this with the threshold given to input terminal 209 and obtains the range (y0 to y1) at or above the threshold. Synthesis unit 210 outputs to output terminal 211 the telop candidate pixels, in which the pixels within the range obtained above (x0 to x1, y0 to y1) have the value 1 and all other pixels have the value 0.

[0031] When the vertical projection yields more than one range at or above the threshold, the horizontal projection is performed for each range. The order of vertical projection unit 204 and horizontal projection unit 207 may also be swapped.
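The projection-and-threshold procedure of FIG. 3 can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the edge image is already available as a binary NumPy array indexed `[y, x]`, and the names `runs_at_or_above`, `extract_candidates`, `col_threshold`, and `row_threshold` are hypothetical.

```python
import numpy as np


def runs_at_or_above(profile, threshold):
    """Return inclusive (start, end) index ranges where profile >= threshold."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value >= threshold and start is None:
            start = i
        elif value < threshold and start is not None:
            runs.append((start, i - 1))
            start = None
    if start is not None:
        runs.append((start, len(profile) - 1))
    return runs


def extract_candidates(edge, col_threshold, row_threshold):
    """Return a 0/1 mask of telop candidate pixels for one edge image."""
    mask = np.zeros(edge.shape, dtype=np.uint8)
    col_profile = edge.sum(axis=0)  # vertical projection: one count per x
    for x0, x1 in runs_at_or_above(col_profile, col_threshold):
        # horizontal projection restricted to the columns x0..x1
        row_profile = edge[:, x0:x1 + 1].sum(axis=1)
        for y0, y1 in runs_at_or_above(row_profile, row_threshold):
            mask[y0:y1 + 1, x0:x1 + 1] = 1
    return mask


edge = np.zeros((12, 16), dtype=np.uint8)
edge[8:11, 3:13] = 1   # dense edges where a telop would be
edge[1, 14] = 1        # a stray edge pixel
mask = extract_candidates(edge, col_threshold=2, row_threshold=5)
print(int(mask.sum()))  # 30
```

Each range found by the first projection is processed separately, which corresponds to the case of several ranges described in paragraph [0031].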

[0032] FIG. 5 is a block diagram of another example of telop candidate pixel extraction unit 102. This example targets video data encoded using inter-frame correlation, such as MPEG. For the encoded video data input to input terminal 301, position decoding unit 302 decodes the pixel position and outputs it to the address (A) of counter 304. Likewise, type decoding unit 303 decodes the encoding type of the pixel. Type decoding unit 303 outputs "1" only when the pixel was encoded using inter-frame correlation and without motion compensation, and outputs "0" otherwise. This signal controls the increment and decrement of counter 304. The value of counter 304 is incremented or decremented within the counting period and is output as is to output terminal 305 as the telop candidate pixels. After the output, all values of counter 304 are reset to 0.
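The counting scheme of FIG. 5 can be sketched as below. This is an illustration under stated assumptions: the `Block` record and its fields are hypothetical stand-ins for whatever a real bitstream parser would report per block, and the sketch only increments the counter when the type signal is "1", which is one possible reading of the increment/decrement control.

```python
from collections import namedtuple

import numpy as np

# Hypothetical per-block record: position plus two coding flags.
Block = namedtuple("Block", "x y inter motion_compensated")


def count_candidates(frames, width, height):
    """Count, per position, blocks coded with inter-frame correlation and no motion compensation."""
    counter = np.zeros((height, width), dtype=np.int32)
    for blocks in frames:  # one list of blocks per frame in the counting interval
        for b in blocks:
            if b.inter and not b.motion_compensated:  # type signal "1"
                counter[b.y, b.x] += 1
    return counter  # the caller starts from a fresh counter for the next interval


frames = [
    [Block(2, 1, True, False), Block(0, 0, True, True)],
    [Block(2, 1, True, False), Block(0, 0, False, False)],
    [Block(2, 1, True, False), Block(0, 0, True, True)],
]
counts = count_candidates(frames, width=4, height=3)
print(int(counts[1, 2]), int(counts[0, 0]))  # 3 0
```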

[0033] Merging unit 104 is described next.

[0034] First, merging unit 104 using a three-dimensional smoothing filter is described. As the three-dimensional smoothing filter, consider the following three-dimensional Gaussian filter.
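The filter equations are not reproduced in this text. As a hedged sketch only, a three-dimensional Gaussian filter and its use are commonly written in the following form. The isotropic shape, the normalization, the width symbol \sigma, and the output symbol B' are assumptions made here for illustration and may differ from the notation of the original equations.

```latex
% Assumed isotropic 3-D Gaussian; \sigma is a hypothetical width parameter.
G(x, y, t) = \frac{1}{(2\pi)^{3/2}\sigma^{3}}
             \exp\!\left(-\frac{x^{2} + y^{2} + t^{2}}{2\sigma^{2}}\right)

% Convolution with the buffer B (discrete form over filter offsets i, j, k).
B'(x, y, t) = \sum_{i}\sum_{j}\sum_{k} G(i, j, k)\, B(x - i,\; y - j,\; t - k)

% Separable alternative: the 3-D filter is the product of three 1-D filters,
% so convolving with g along x, then y, then t gives the same result.
g(u) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left(-\frac{u^{2}}{2\sigma^{2}}\right),
\qquad G(x, y, t) = g(x)\, g(y)\, g(t)
```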

[0035]

[Equation 1]

This filter is convolved with buffer 103 (B(x, y, t)).

[0036]

[Equation 2]

Alternatively, a one-dimensional Gaussian filter

[0037]

[Equation 3]

may be convolved in turn along each of the three axes x, y, and t. That is,

[0038]

[Equation 4]

is used.
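The separable variant, convolving a one-dimensional Gaussian along each axis in turn, can be sketched with NumPy as follows. This is an illustration under assumptions: the buffer is stored as an array indexed `[t, y, x]`, the values `sigma=1.0`, `radius=2`, and the 0.5 threshold are arbitrary, and borders are zero-padded. Every axis must be at least as long as the kernel for `np.convolve(..., mode="same")` to preserve the shape.

```python
import numpy as np


def gaussian_kernel_1d(sigma, radius):
    """Sampled 1-D Gaussian, normalized so the weights sum to 1."""
    u = np.arange(-radius, radius + 1)
    k = np.exp(-(u ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()


def smooth_3d(buffer, sigma=1.0, radius=2):
    """Smooth a 3-D buffer by convolving the 1-D kernel along each axis in turn."""
    k = gaussian_kernel_1d(sigma, radius)
    out = buffer.astype(float)
    for axis in range(3):
        out = np.apply_along_axis(
            lambda v: np.convolve(v, k, mode="same"), axis, out)
    return out


buffer = np.zeros((20, 9, 9))
buffer[2:12, 2:7, 2:7] = 1.0   # a telop visible for ten time samples
buffer[6, 2:7, 2:7] = 0.0      # one sample missed because of noise
buffer[17, 1, 1] = 1.0         # an isolated false candidate
merged = smooth_3d(buffer) >= 0.5
print(bool(merged[6, 4, 4]), bool(merged[17, 1, 1]))  # True False
```

In this toy buffer the missed sample inside the telop is recovered after thresholding, while the isolated candidate falls below the threshold, which matches the behavior described in paragraph [0019].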

[0039] Next, merging unit 104 using expansion and contraction is described. Expansion (morphological dilation) sets the value of a pixel B(x, y, t) to the maximum value of the points contained in its neighborhood R(x, y, t). That is, pixel B(x, y, t) takes the value Be(x, y, t) given by the following equation.

[0040]

[Equation 5]

$$B_e(x, y, t) = \max_{(x', y', t') \in R(x, y, t)} B(x', y', t')$$

Expansion fills holes and gaps whose width, height, or depth is smaller than R. For example, applying 4-neighborhood expansion (the four pixels adjoining the pixel of interest above, below, left, and right) to FIG. 6(1) gives FIG. 6(2). The gap between the two black regions is gone, and the white hole inside the black region is filled. Contraction (morphological erosion) sets the value of a pixel B(x, y, t) to the minimum value of the points contained in its neighborhood R(x, y, t). That is, pixel B(x, y, t) takes the value Be(x, y, t) given by the following equation.

[0041]

[Equation 6]

$$B_e(x, y, t) = \min_{(x', y', t') \in R(x, y, t)} B(x', y', t')$$

Contraction erases regions whose width, height, or depth is smaller than R. For example, applying 4-neighborhood contraction to FIG. 6(1) gives FIG. 6(3): the black region of height 2 has disappeared. Applying contraction to FIG. 6(2), the result of the earlier expansion, gives FIG. 6(4). Comparing FIG. 6(1) with FIG. 6(4) shows that the holes and gaps are gone while the size is preserved. In other words, expansion followed by contraction compensates for missing pixels such as holes and gaps. Applying expansion to FIG. 6(3), the result of contraction, gives FIG. 6(5). Comparing FIG. 6(1) with FIG. 6(5) shows that the small regions have disappeared while the size is preserved. In other words, contraction followed by expansion removes noise.
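The expansion and contraction operations can be sketched with NumPy as below. This is an illustrative sketch, not the patented implementation: the neighborhood R is assumed to be the pixel plus its face-adjacent neighbors (4 in two dimensions, 6 in three), the border padding values are a design choice made here, and the toy image is not the one in FIG. 6.

```python
import numpy as np


def neighbors(b, pad_value):
    """Stack each pixel with its face-adjacent neighbors (4 in 2-D, 6 in 3-D)."""
    p = np.pad(b, 1, mode="constant", constant_values=pad_value)
    center = [slice(1, -1)] * b.ndim
    stack = [p[tuple(center)]]
    for axis in range(b.ndim):
        for shifted in (slice(0, -2), slice(2, None)):
            idx = list(center)
            idx[axis] = shifted
            stack.append(p[tuple(idx)])
    return np.stack(stack)


def dilate(b):
    """Expansion: replace each pixel by the maximum over its neighborhood."""
    return neighbors(b, b.min()).max(axis=0)


def erode(b):
    """Contraction: replace each pixel by the minimum over its neighborhood."""
    return neighbors(b, b.max()).min(axis=0)


b = np.zeros((9, 13), dtype=int)
b[2:7, 2:6] = 1     # left region
b[2:7, 7:11] = 1    # right region, one column away from the left one
b[4, 3] = 0         # a hole inside the left region
b[0, 12] = 1        # an isolated pixel

closed = erode(dilate(b))   # expansion then contraction: fills hole and gap
opened = dilate(erode(b))   # contraction then expansion: removes isolated pixel
print(closed[4, 3], closed[4, 6], opened[0, 12])  # 1 1 0
```

The same functions accept a three-dimensional array, so they can be applied to a buffer with a time axis without modification.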

[0042] In another example of merging unit 104, the configuration of FIG. 7 is used so that the order of expansion-contraction processing and contraction-expansion processing can be switched. The telop candidate pixels input to input terminal 401 are processed by expansion units 402 and 405 and contraction units 403 and 404 via four interlocked switches 406 and output to output terminal 407. When the contacts of switches 406 are in the upper position, expansion-contraction processing is performed first, followed by contraction-expansion processing. When the contacts of switches 406 are in the lower position, the order is reversed.

[0043] Performing expansion-contraction processing first gives priority to filling in missing pixels, whereas performing contraction-expansion processing first gives priority to removing noise.
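The effect of the processing order can be illustrated with a small sketch. The function `merge` and its flag `fill_first` are hypothetical names that play the role of the switch position in FIG. 7, and the neighborhood is again assumed to be the face-adjacent one. The example uses a one-dimensional signal along the time axis, standing in for a telop that flickers in and out of detection.

```python
import numpy as np


def _neighborhood(b, pad):
    """Stack each sample with its face-adjacent neighbors along every axis."""
    p = np.pad(b, 1, constant_values=pad)
    n = b.ndim
    mid = slice(1, -1)
    views = [p[(mid,) * n]]
    for ax in range(n):
        for s in (slice(0, -2), slice(2, None)):
            views.append(p[(mid,) * ax + (s,) + (mid,) * (n - ax - 1)])
    return np.stack(views)


def dilate(b):
    return _neighborhood(b, b.min()).max(axis=0)


def erode(b):
    return _neighborhood(b, b.max()).min(axis=0)


def merge(b, fill_first=True):
    """Apply both passes; fill_first selects which one runs first."""
    def expand_contract(v):   # fills gaps
        return erode(dilate(v))

    def contract_expand(v):   # removes small regions
        return dilate(erode(v))

    stages = ((expand_contract, contract_expand) if fill_first
              else (contract_expand, expand_contract))
    for stage in stages:
        b = stage(b)
    return b


flicker = np.array([0, 0, 1, 0, 1, 0, 1, 0, 0])
print(merge(flicker, fill_first=True).tolist())   # [0, 0, 1, 1, 1, 1, 1, 0, 0]
print(merge(flicker, fill_first=False).tolist())  # [0, 0, 0, 0, 0, 0, 0, 0, 0]
```

With gap filling first, the flickering detections are joined into one interval. With noise removal first, each short detection is treated as noise and nothing survives.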

[0044] FIG. 8 is a block diagram of an embodiment in which a determination unit is added to the first embodiment of FIG. 1. Telop candidate pixel extraction unit 102 detects pixels or sets of pixels that are candidates for a telop region in the video input to input terminal 101 and accumulates them in buffer 103, and merging unit 104 merges the telop candidate pixels. From the merged telop candidate pixels, determination unit 106 determines the representative frames representing telops and outputs them to output terminal 105.

[0045] Two examples of determination unit 106 are described next. The first takes the frame before or after a time interval containing no telop candidate pixels as a representative frame representing a telop.

[0046] For example, suppose telops A to G are arranged in time as shown in FIG. 9. In this figure, the horizontal axis is the time axis (t axis) and the vertical axis is the x or y axis. b₁ to b₄ mark the frames that follow a time interval containing no telop candidate pixels, and f₁ to f₄ mark the frames that precede such an interval.

[0047] If b₁ to b₄ are used as the representative frames, telops A, B, D, F, and G are captured, but telops such as C and E, which appear while another telop is on screen, are not. If f₁ to f₄ are used instead, telops A, B, C, D, and F are captured, but telops such as E and G, which disappear while another telop is on screen, are not. Because intervals containing no telop candidate pixels are relatively easy to detect, this method has the advantage of simplicity.
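Finding the frames that border an empty interval can be sketched as follows. This is a minimal illustration: the function name `representative_frames` and the list `occupied` are hypothetical, and the sketch assumes the video starts and ends with an empty interval.

```python
def representative_frames(occupied):
    """occupied[t] is True when frame t holds any merged telop candidate pixel.

    Returns (after_gap, before_gap): frames that follow an empty interval
    (the b frames) and frames that precede one (the f frames).
    """
    after_gap, before_gap = [], []
    for t, now in enumerate(occupied):
        prev = occupied[t - 1] if t > 0 else False
        nxt = occupied[t + 1] if t + 1 < len(occupied) else False
        if now and not prev:
            after_gap.append(t)    # first frame after an empty interval
        if now and not nxt:
            before_gap.append(t)   # last frame before an empty interval
    return after_gap, before_gap


occupied = [False, True, True, True, False, False, True, True, False]
print(representative_frames(occupied))  # ([1, 6], [3, 7])
```

A telop that appears or disappears while another telop keeps the frames occupied produces no boundary here, which is the limitation noted in paragraph [0047].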

[0048] Next, an example of determination unit 106 that uses labeling is described.

[0049] FIG. 10 is a block diagram of determination unit 106 using labeling. For the telop candidate pixels input to input terminal 501, labeling unit 502 obtains the connected components with neighboring pixels and stores them in buffer 503 as label information. The label information is managed in a table such as the one shown in FIG. 11(1). Here, as shown in FIG. 11(2), the position of a label is expressed by the coordinates of its circumscribed cuboid. Determination unit 504 selects a t in the range t₀ ≤ t ≤ t₁ and outputs it to output terminal 505 as the representative frame.

[0050] As an example, FIG. 12 shows the same temporal arrangement of telops as FIG. 9. According to this embodiment, each of telops A to G can be identified and its time range obtained. Here, the frame in which a telop appears (t₀) is used as the representative frame, and the times t₁ to t₆ are output.

[0051] The frame immediately before the telop disappears (t₁) may be used as the representative frame instead, or a frame midway between t₀ and t₁ may be used.
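Labeling connected components and recording each label as a circumscribed cuboid can be sketched in plain Python. This is an illustration under assumptions: candidates are given as a set of `(x, y, t)` positions, connectivity is face adjacency in three dimensions, and the names `label_components`, `representative_frame`, and the `mode` values are hypothetical.

```python
from collections import deque

STEPS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def label_components(voxels):
    """Return one bounding cuboid per connected component of candidate positions."""
    remaining = set(voxels)
    labels = []
    while remaining:
        seed = remaining.pop()
        queue, component = deque([seed]), [seed]
        while queue:  # breadth-first flood fill from the seed
            x, y, t = queue.popleft()
            for dx, dy, dt in STEPS:
                n = (x + dx, y + dy, t + dt)
                if n in remaining:
                    remaining.remove(n)
                    queue.append(n)
                    component.append(n)
        xs, ys, ts = zip(*component)
        labels.append({"x0": min(xs), "x1": max(xs), "y0": min(ys),
                       "y1": max(ys), "t0": min(ts), "t1": max(ts)})
    labels.sort(key=lambda c: (c["t0"], c["x0"], c["y0"]))
    return labels


def representative_frame(label, mode="start"):
    """Pick a frame in t0..t1: the first, the last, or the middle one."""
    if mode == "start":
        return label["t0"]
    if mode == "end":
        return label["t1"]
    return (label["t0"] + label["t1"]) // 2


# Telop A on row 0 for t = 0..4; telop C on row 3 for t = 2..3, while A is shown.
voxels = {(x, 0, t) for x in range(3) for t in range(5)}
voxels |= {(x, 3, t) for x in range(3) for t in range(2, 4)}
labels = label_components(voxels)
print([representative_frame(c) for c in labels])  # [0, 2]
```

Because each component gets its own label, the telop that appears while another one is on screen still receives a representative frame.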

[0052] The present invention can be modified and practiced in various ways without departing from the gist of the invention. For example, the telop detection results can be used to display representative frames and create a list of the telops in a video.

[0053]

[Effects of the invention] As described above, the present invention removes false detections caused by telop-like subjects that appear briefly and compensates for temporary telop detection failures caused by image-quality degradation, noise, and the like, so that telops can be detected without excess or omission.

[Brief description of drawings]

[FIG. 1] A block diagram of a video telop extraction device according to one embodiment of the present invention.

[FIG. 2] An explanatory diagram showing three-dimensional buffer 103.

[FIG. 3] A block diagram of an example of telop candidate pixel extraction unit 102.

[FIG. 4] An explanatory diagram showing the principle of telop detection.

[FIG. 5] A block diagram of another example of telop candidate pixel extraction unit 102.

[FIG. 6] An exemplary diagram showing expansion and contraction of telop candidate pixels.

[FIG. 7] A block diagram of an example of telop candidate pixel merging unit 104.

[FIG. 8] A block diagram of a video telop extraction device according to another embodiment of the present invention.

[FIG. 9] An exemplary diagram showing determination results from an embodiment that determines the frames before and after time intervals containing no telop candidate pixels.

[FIG. 10] A block diagram showing an example of determination unit 106 using labeling.

[FIG. 11] An exemplary diagram showing label information.

[FIG. 12] An exemplary diagram showing determination results from an example of determination unit 106 using labeling.

[Explanation of symbols]

101 Input terminal
102 Telop candidate pixel extraction unit
103 Buffer
104 Merging unit
105 Output terminal
106 Determination unit
201 Input terminal
202 Edge generation unit
203 Buffer
204 Vertical projection unit
205 Comparison unit
206 Input terminal
207 Horizontal projection unit
208 Comparison unit
209 Input terminal
210 Synthesis unit
211 Output terminal
301 Input terminal
302 Position decoding unit
303 Type decoding unit
304 Counter
305 Output terminal
401 Input terminal
402, 405 Expansion unit
403, 404 Contraction unit
406 Switch
407 Output terminal
501 Input terminal
502 Labeling unit
503 Buffer
504 Determination unit
505 Output terminal

Continuation of the front page

(56) References: JP-A-9-18798 (JP, A); JP-A-8-317301 (JP, A); JP-A-8-212231 (JP, A); JP-A-7-192003 (JP, A)

(58) Fields searched (Int. Cl.⁷, DB name): H04N 5/262 - 5/28; G06T 7/00 - 7/60; G06K 9/00; G06K 9/46 - 9/52; G06K 9/62 - 9/82

Claims

(57) [Claims]

1. A video telop detection method for detecting a telop from video, comprising: an extraction step of extracting telop candidate pixels in units of pixels or sets of pixels and storing them in a three-dimensional buffer having vertical and horizontal spatial axes and a time axis; and a merging step of merging the telop candidate pixels in the buffer.

2. A video telop detection device for detecting a telop from video, comprising: a three-dimensional buffer having vertical and horizontal spatial axes and a time axis; extraction means for extracting telop candidate pixels in units of pixels or sets of pixels and storing them in the buffer; and merging means for merging the telop candidate pixels in the buffer.

3. The video telop detection device according to claim 2, wherein the extraction means has edge generation means for obtaining the edges of the video, projection means for projecting the edge values in the vertical and horizontal directions, and comparison means for determining telop candidate pixels based on the result of comparing the projected values with a threshold.

4. The video telop detection device according to claim 2, wherein the extraction means has counting means for counting, for each pixel position within a counting interval, the number of pixels that were encoded using inter-frame correlation and without motion compensation, from video data encoded using inter-frame correlation.

5. The video telop detection device according to claim 2, wherein the merging unit has a smoothing unit that smoothes the telop candidate pixels with a three-dimensional smoothing filter.

6. The merging means has expansion means for replacing the telop candidate pixel with the maximum value of the neighboring pixels and contraction means for replacing the telop candidate pixel with the minimum value of the neighboring pixels. The video telop detection device according to.

7. The video telop detection device according to claim 2, further comprising determination means for taking the frame before or after a time interval containing no telop candidate pixels as a representative frame representing a telop.

8. The video telop detection device according to any one of claims 2 to 6, further comprising labeling means for assigning labels to the connected components of the merged telop candidate pixels, and determination means for taking a frame containing labeled telop candidate pixels as a representative frame representing a telop.