JP3569390B2

JP3569390B2 - Apparatus and method for extracting character appearance frame

Info

Publication number: JP3569390B2
Application number: JP19049296A
Authority: JP
Inventors: 秀豪桑野; 正治倉掛
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1996-07-19
Filing date: 1996-07-19
Publication date: 2004-09-22
Anticipated expiration: 2016-07-19
Also published as: JPH1040391A

Description

[0001]
[Technical field of the invention]
The present invention relates to a character appearance frame extraction device and a character appearance frame extraction method for extracting a frame including a character from a plurality of frames constituting a moving image.
[0002]
[Prior art]
In recent years, much research has been conducted on a technique for extracting a frame including a character from a plurality of frames constituting a moving image. As one of the methods, a method has been proposed in which an area of a high contrast portion is obtained with respect to luminance in each frame, and a frame at a time when the area value rapidly increases on a time axis is set as a character appearance frame. References related to this method include the following references [1], [2], and [3].
[0003]
In addition, a method as disclosed in the following reference [4] specialized in extracting a character appearance frame in a news video has been proposed. This method utilizes the fact that a telop character is displayed in a frame where a scene change has been made, and is a method of selecting a studio scene from scene change frames and extracting this frame as a character appearance frame.
[0004]
(References)
[1] Nakajima, Hori, Shiobara: "Creation of a moving image summary by extracting keyword images", IPSJ National Convention, 1F-10, (second half of 1994).
[2] Nemoto, Hanya, Miyauchi: "Retrieval of material video by recognizing telops", IEICE General Conference, D-427, (Spring 1994).
[3] Takano, Nakamura: "Telop detection by H.261 code handling", Proc. of the Institute of Image Electronics Engineers of Japan, 94-06-04, pp. 13-16, (1994-06).
[4] Mogi, Ariki: "Indexing of Articles Based on Character Recognition in News Videos", IEICE Technical Report, IE95-153, PRU95-240, pp. 33-40, (1996-03).
[0005]
[Problems to be solved by the invention]
Among the conventional methods, the method based on the temporal change in the area of the high-contrast luminance portion has the problem of over-detection: frames in which the subject moves violently and continuously over multiple frames, or in which a flash is fired, are detected as character appearance frames.
[0006]
Furthermore, when the area of the appearing character is small relative to the area of the entire image, or when the character fades in, the appearance does not readily show up as a sharp increase in the high-contrast area. The frame is therefore not recognized as a character appearance frame and cannot be detected.
[0007]
Further, the method using scene change frames does not determine whether a scene change was caused by the appearance of a character, so a scene change frame in which no character appears is erroneously detected as a character appearance frame.
[0008]
An object of the present invention is to reduce the character appearance frame extraction failures and the over-detection that were problems in the conventional methods described above, and to improve the accuracy of extracting frames in which characters appear.
[0009]
[Means for Solving the Problems]
According to the present invention, each frame constituting a moving image is divided into partial rectangular regions by a predetermined method, and luminance histogram difference values are calculated in advance between corresponding partial rectangular regions of the frame and of a plurality of frames at preceding and following times.
Among the plurality of luminance histogram difference values obtained between each frame and the plurality of frames at preceding and following times, if the frame has a partial rectangular region satisfying the condition that at least one of the difference values obtained against the frames at preceding times is larger than a preset first threshold, and all of the difference values obtained against the frames at following times are smaller than a preset second threshold, the frame is extracted as a character appearance frame.
[0010]
The operation of the present invention is as follows.
When telop characters in a video or characters on a flip board are shown, the appearance of the characters typically produces a difference in pixel values from the preceding frame, and the character portion then changes little over the several frames that follow the frame in which the characters appeared. Therefore, character appearance can be detected by the character appearance determining means (the seventh means recited in claim 1), which detects a frame having a partial rectangular region that satisfies the condition described above.
[0011]
When the subject moves violently and continuously over multiple frames, or when a flash is fired, not only the luminance histogram difference values calculated against frames at preceding times but also those calculated against frames at following times change sharply. The seventh means recited in claim 1 detects, as a character appearance frame, a frame at a time satisfying the condition that at least one of the luminance histogram difference values against frames at preceding times is larger than a preset threshold and all of the luminance histogram difference values against frames at following times are smaller than a preset threshold. This suppresses over-extraction of violent continuous motion and flash frames as character appearance frames.
[0012]
When a character fades in, the luminance histogram difference value between frames at adjacent times is small, but if the time interval between the frames being compared is increased, the luminance histogram difference against the frame at the preceding time becomes large. For the frame at the time when the fade effect ends and the luminance of the character region becomes constant over the subsequent frames, the luminance histogram difference against frames at following times is small, which satisfies the threshold condition for character appearance frame extraction described above. Therefore, by calculating luminance histogram difference values against a plurality of frames at preceding times and making the determination in the seventh means recited in claim 1, character appearance frames can be extracted stably even when characters fade in.
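The three cases discussed above (abrupt appearance, flash, and fade-in) can be sketched with a toy example. This is only an illustration under stated assumptions: the per-frame pixel counts, the thresholds `TH1` and `TH2`, and the function name `satisfies_condition` are hypothetical, and a single luminance class of a single region stands in for the full histogram.

```python
# Toy data (hypothetical): series[t] is the number of pixels that fall in the
# brightest luminance class of one partial rectangular region at frame t.
cut_in = [0, 0, 0, 0, 40, 40, 40, 40]      # caption appears abruptly at t = 4
flash = [0, 0, 0, 0, 40, 0, 0, 0]          # one-frame flash at t = 4
fade_in = [0, 10, 20, 30, 40, 40, 40, 40]  # fade-in that completes at t = 4

TH1 = 25  # assumed first threshold (change against preceding frames)
TH2 = 5   # assumed second threshold (change against following frames)


def satisfies_condition(series, t, n_max, th1=TH1, th2=TH2):
    """Check the appearance condition at frame t using N = n_max frames each way."""
    prev_diffs = [abs(series[t] - series[t - n]) for n in range(1, n_max + 1)]
    next_diffs = [abs(series[t] - series[t + n]) for n in range(1, n_max + 1)]
    # At least one large change against the past, no large change against the future.
    return any(d >= th1 for d in prev_diffs) and all(d < th2 for d in next_diffs)


print(satisfies_condition(cut_in, 4, n_max=1))   # True: abrupt appearance
print(satisfies_condition(flash, 4, n_max=1))    # False: following frame also changes
print(satisfies_condition(fade_in, 4, n_max=1))  # False: adjacent difference is only 10
print(satisfies_condition(fade_in, 4, n_max=3))  # True: difference against t - 3 is 30
```

With only the adjacent frame (N = 1) the fade-in is missed, while comparing against frames up to three steps back exposes the accumulated change.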
[0013]
When the area of the character appearance portion is small relative to the area of the partial rectangular region, and is therefore not readily reflected in changes in the distribution pattern of the luminance histogram, this can be handled by setting the partial rectangular regions in advance to a plurality of hierarchical sizes and judging that a character has appeared when any one of the character appearance detection results for overlapping partial rectangular regions in different layers indicates a character appearance.
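A small worked calculation may make the motivation for smaller regions concrete. The frame size, region size, and glyph area below are hypothetical values chosen for illustration, and the glyph is assumed to lie entirely inside the smaller region.

```python
def affected_fraction(char_pixels, region_width, region_height):
    """Share of a region's pixels that the appearing character can move
    from one luminance class to another."""
    return char_pixels / (region_width * region_height)


char_pixels = 20 * 10  # hypothetical caption area in pixels

whole_frame = affected_fraction(char_pixels, 320, 240)  # region = entire frame
small_region = affected_fraction(char_pixels, 80, 60)   # region = 1/16 of the frame

print(f"{whole_frame:.2%}")   # 0.26%
print(f"{small_region:.2%}")  # 4.17%
```

The same caption changes a sixteen times larger share of the histogram in the smaller region, so a frequency threshold is more easily exceeded there.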
[0014]
[Embodiments of the invention]
Hereinafter, an embodiment of the present invention will be described in detail with reference to FIG. 1.
FIG. 1 is a block diagram showing an example of the configuration of an apparatus for implementing the present invention. In the figure, 1 is a moving image input storage unit, 2 is a partial rectangular area division unit, 3 is a luminance histogram generation unit, 4 is a character appearance detection processing target time designation unit, 5 is a character appearance detection processing start control unit, 6 is a luminance histogram difference value calculation unit, 7 is a character appearance determination unit, and 8 is an output unit for the character appearance frame extraction results.
[0015]
The moving image input storage unit 1 stores a plurality of frames constituting a moving image in a memory.
The partial rectangular area division unit 2 divides each frame constituting the moving image stored by the moving image input storage unit 1 into partial rectangular areas using a plurality of preset frame division methods (see FIG. 2). FIG. 2 shows an example of a frame division method, in which a plurality of layers are provided and the frame is divided into partial rectangular areas of a different size for each layer.
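As a rough sketch of such a layered division, the following assumes a grid per layer (1x1, 2x2, 4x4). The actual layout of FIG. 2 is not reproduced in this text, so the grid sizes, the function name `divide_frame`, and the tuple layout are assumptions for illustration only.

```python
def divide_frame(width, height, grids=((1, 1), (2, 2), (4, 4))):
    """Return (layer, x0, y0, x1, y1) for every partial rectangular area.

    Each entry of `grids` is an assumed (columns, rows) split for one layer,
    so areas in deeper layers are smaller and overlap areas in shallower layers.
    """
    regions = []
    for layer, (cols, rows) in enumerate(grids):
        for r in range(rows):
            for c in range(cols):
                x0, x1 = c * width // cols, (c + 1) * width // cols
                y0, y1 = r * height // rows, (r + 1) * height // rows
                regions.append((layer, x0, y0, x1, y1))
    return regions


regions = divide_frame(320, 240)
print(len(regions))   # 21 areas: 1 + 4 + 16
print(regions[0])     # (0, 0, 0, 320, 240)
print(regions[-1])    # (2, 240, 180, 320, 240)
```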
[0016]
The luminance histogram generation unit 3 generates a luminance histogram for each partial rectangular area of each frame divided by the partial rectangular area division unit 2. A preset class width is used as the luminance class width of the histogram.
[0017]
FIG. 3 is an explanatory diagram of a luminance histogram. The luminance histogram generation unit 3 generates a luminance histogram, for example as shown in FIG. 3A, for each of the partial rectangular areas shown in FIG. 2. The horizontal axis of the histogram is the luminance value (density value), and the vertical axis is the number of pixels. In this example, the luminance value range is [0.0, 1.0], and the number of pixels is counted in classes of width 0.2.
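A minimal sketch of this histogram, assuming luminance values already normalized to [0.0, 1.0] and five classes of width 0.2. The function name, the sample pixel values, and the choice to place a value of exactly 1.0 in the last class are assumptions.

```python
def luminance_histogram(pixels, n_classes=5):
    """Count pixels per luminance class over the range [0.0, 1.0].

    With n_classes=5 each class is 0.2 wide, matching the example in the text.
    """
    hist = [0] * n_classes
    for value in pixels:
        # Clamp so that a luminance of exactly 1.0 falls in the last class.
        index = min(int(value * n_classes), n_classes - 1)
        hist[index] += 1
    return hist


# Hypothetical luminance values of the pixels in one partial rectangular area.
region_pixels = [0.05, 0.10, 0.15, 0.50, 0.90, 0.95, 1.00]
print(luminance_histogram(region_pixels))  # [3, 0, 1, 0, 3]
```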
[0018]
The character appearance detection processing target time designation unit 4 sets, by a predetermined method, the time of the frame on which character appearance detection processing is to be performed, choosing from the frames for which the luminance histogram generation unit 3 has generated luminance histograms and on which character appearance detection processing has not yet been performed. The times may be chosen, for example, in chronological order.
[0019]
The character appearance detection processing start control unit 5 determines whether the luminance histogram generation unit 3 has generated a luminance histogram for each partial rectangular area in the frame corresponding to the time designated by the character appearance detection processing target time designation unit 4 and in the frames corresponding to one or more times separated from that frame, before and after on the time axis, by each of one or more preset time intervals. If the histograms have been generated, it decides to start the character appearance detection processing.
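The start check can be sketched as a simple lookup, assuming histograms are kept in a dictionary keyed by frame time. The container, the function name `can_start_detection`, and the interval values are hypothetical.

```python
def can_start_detection(t, histograms, intervals=(1, 2)):
    """Return True when histograms exist for frame t and for every frame
    at t - n and t + n, for each preset interval n."""
    needed = [t] + [t - n for n in intervals] + [t + n for n in intervals]
    return all(time in histograms for time in needed)


# Hypothetical store: histograms have been generated for frames 0 to 4 only.
histograms = {time: [[0, 0, 0, 0, 0]] for time in range(5)}

print(can_start_detection(2, histograms))  # True: frames 0..4 are all available
print(can_start_detection(3, histograms))  # False: frame 5 is not generated yet
print(can_start_detection(1, histograms))  # False: frame -1 does not exist
```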
[0020]
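For the frame F(T) at the character appearance detection processing target time, let the preceding and following frames that satisfy the condition of the character appearance detection processing start control unit 5 be F(T-N), ..., F(T-1) and F(T+1), ..., F(T+N). The luminance histogram difference value calculation unit 6 calculates luminance histogram difference values between corresponding partial rectangular areas, between F(T) and each of F(T-N), F(T-N+1), ..., F(T-1), and between F(T) and each of F(T+1), F(T+2), ..., F(T+N).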
[0021]
FIG. 4 is a processing flowchart of the luminance histogram difference value calculation unit 6.
As shown in FIG. 4, the luminance histogram difference value calculation unit 6 first calculates the luminance histogram difference values between F(T) and F(T-n) (where n = 1, 2, ..., N), then calculates the luminance histogram difference values between F(T) and F(T+n) (where n = 1, 2, ..., N), and sends the resulting luminance histogram difference values to the character appearance determination unit 7.
[0022]
The luminance histogram difference value is defined as the difference in frequency in each class and is obtained between corresponding partial rectangular areas. In other words, the number of difference values calculated for each partial rectangular area equals the number of preceding and following frames that are input.
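The definition above can be sketched as follows for a single partial rectangular area. Taking the absolute value of the per-class difference is an assumption (the text only says "difference in frequency"), and the function names and histogram data are hypothetical.

```python
def histogram_difference(hist_a, hist_b):
    """Per-class frequency difference between two histograms (absolute value assumed)."""
    return [abs(a - b) for a, b in zip(hist_a, hist_b)]


def difference_values(hists, t, n_max):
    """Differences between F(t) and F(t-n), then F(t) and F(t+n), for n = 1..n_max."""
    prev = [histogram_difference(hists[t], hists[t - n]) for n in range(1, n_max + 1)]
    nxt = [histogram_difference(hists[t], hists[t + n]) for n in range(1, n_max + 1)]
    return prev, nxt


# Hypothetical histograms of one area over five frames; four pixels move
# from the darkest class to the brightest class at frame 2.
region_hists = [
    [10, 0, 0, 0, 0],
    [10, 0, 0, 0, 0],
    [6, 0, 0, 0, 4],
    [6, 0, 0, 0, 4],
    [6, 0, 0, 0, 4],
]

prev, nxt = difference_values(region_hists, t=2, n_max=2)
print(prev)                  # [[4, 0, 0, 0, 4], [4, 0, 0, 0, 4]]
print(nxt)                   # [[0, 0, 0, 0, 0], [0, 0, 0, 0, 0]]
print(len(prev) + len(nxt))  # 4 difference values: 2 preceding + 2 following frames
```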
[0023]
FIG. 5 shows an example of the difference value calculation by the luminance histogram difference value calculation unit 6. In the example of FIG. 5, two preceding frames and two following frames are the targets of the luminance histogram difference value calculation. Luminance histogram difference values are calculated between corresponding partial rectangular areas in these frames.
[0024]
For each partial rectangular area, the character appearance determination unit 7 determines whether the plurality of difference values calculated by the luminance histogram difference value calculation unit 6 satisfy the condition that at least one of the values calculated against the plurality of frames at preceding times is equal to or greater than a preset threshold, and all of the values calculated against the plurality of frames at following times are smaller than a preset threshold. If the frame has a partial rectangular area satisfying this condition, the frame is determined to be a character appearance frame. If the difference values calculated by the luminance histogram difference value calculation unit 6 do not satisfy the condition, the frame is determined to be a non-character appearance frame, and processing returns to the character appearance detection processing target time designation unit 4.
[0025]
That is, the first condition for determining a character appearance frame is that, for example, when frequency difference values are calculated for each class between the corresponding luminance histograms of the frame F(T) in FIG. 5 and the preceding frames F(T-1) and F(T-2), at least one of those frequency difference values is equal to or greater than a preset frequency threshold. For example, this condition is satisfied when luminance histogram difference values such as those shown in FIG. 3B exist.
[0026]
The second condition is that, for example, when frequency difference values are calculated for each class between the corresponding luminance histograms of the frame F(T) in FIG. 5 and the following frames F(T+1) and F(T+2), all of those frequency difference values are smaller than a preset frequency threshold. This condition is satisfied when all of the luminance histogram difference values are smaller than the predetermined threshold, for example as shown in FIG. 3C.
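The two conditions, applied per class and then combined over all partial rectangular areas of a frame, can be sketched as below. The per-class difference values are supplied directly as hypothetical data, and the function names and threshold values are assumptions rather than values from the specification.

```python
def region_satisfies(prev_diffs, next_diffs, th1, th2):
    """prev_diffs / next_diffs: one per-class difference list per compared frame."""
    # First condition: some class differs by th1 or more against a preceding frame.
    cond1 = any(d >= th1 for diff in prev_diffs for d in diff)
    # Second condition: every class differs by less than th2 against all following frames.
    cond2 = all(d < th2 for diff in next_diffs for d in diff)
    return cond1 and cond2


def is_character_appearance_frame(regions, th1, th2):
    """A frame qualifies if any of its partial rectangular areas satisfies both conditions."""
    return any(region_satisfies(p, n, th1, th2) for p, n in regions)


# Hypothetical difference values for two areas of frame F(T), with N = 2.
caption_area = ([[4, 0, 0, 0, 4], [4, 0, 0, 0, 4]],   # against F(T-1), F(T-2)
                [[0, 0, 0, 0, 0], [0, 0, 0, 0, 0]])   # against F(T+1), F(T+2)
moving_area = ([[5, 5, 0, 0, 0], [3, 3, 0, 0, 0]],
               [[4, 4, 0, 0, 0], [6, 6, 0, 0, 0]])

print(is_character_appearance_frame([moving_area], th1=3, th2=2))                # False
print(is_character_appearance_frame([moving_area, caption_area], th1=3, th2=2))  # True
```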
[0027]
The output unit 8 outputs a character appearance frame extraction result obtained by the processing of the character appearance determination unit 7.
[0028]
[Effects of the invention]
As described above, according to the present invention, whether each frame constituting a moving image is a character appearance frame is determined using a quantity that reflects the pattern of luminance change over a fixed period before and after the character appears. Character appearance frames can therefore be extracted with higher accuracy than with the conventional methods.
[Brief description of the drawings]
FIG. 1 is a block diagram showing a configuration of an apparatus for implementing the present invention.
FIG. 2 is a diagram illustrating an example of a frame dividing method by a partial rectangular area dividing unit.
FIG. 3 is an explanatory diagram of a luminance histogram.
FIG. 4 is a flowchart illustrating an example of a process performed by a luminance histogram difference value calculation unit.
FIG. 5 is a diagram illustrating an example of difference value calculation by the luminance histogram difference value calculation unit.
[Explanation of symbols]
1 moving image input storage unit
2 partial rectangular area division unit
3 luminance histogram generation unit
4 character appearance detection processing target time designation unit
5 character appearance detection processing start control unit
6 luminance histogram difference value calculation unit
7 character appearance determination unit
8 output unit for character appearance frame extraction results

Claims

A character appearance frame extraction apparatus for extracting a frame including a character from a plurality of frames constituting a moving image, comprising:
first means for inputting and storing a moving image in frame units;
second means for providing a plurality of layers for frame division and dividing each frame input using the first means into a plurality of partial rectangular areas of a different size for each layer;
third means for generating a luminance histogram for each partial rectangular area in each frame divided into partial rectangular areas using the second means;
fourth means for designating a character appearance detection processing target time;
fifth means for deciding to start character appearance detection processing when, for the frame corresponding to the time designated using the fourth means and for the frames corresponding to one or more times separated from the time of that frame, before and after on the time axis, by each of one or more preset time intervals, the luminance histogram obtained using the third means has been generated for each partial rectangular area obtained using the second means;
sixth means for calculating, between the frame at the character appearance detection processing target time and the frames corresponding to one or more times separated from the time of that frame, before and after on the time axis, by each of one or more preset time intervals, a luminance histogram difference value for each partial rectangular area obtained using the second means, using the luminance histograms obtained using the third means;
seventh means for determining, for each partial rectangular area obtained using the second means and using the luminance histogram difference values obtained using the sixth means, whether a condition is satisfied that at least one of the difference values calculated against frames at times earlier than the frame at the character appearance detection processing target time designated by the fourth means is equal to or greater than a preset first threshold, and all of the difference values calculated against frames at times later than the frame at the character appearance detection processing target time are smaller than a preset second threshold, and for detecting the frame as a character appearance frame when a partial rectangular area satisfying the condition exists; and
eighth means for outputting the frame detected as a character appearance frame using the seventh means.

A character appearance frame extraction method for extracting a frame including a character from a plurality of frames constituting a moving image, comprising:
a step of inputting a moving image in frame units;
a step of dividing each frame constituting the moving image into partial rectangular areas by a predetermined method;
a step of generating a luminance histogram for each partial rectangular area in each frame divided into partial rectangular areas;
a step of calculating luminance histogram difference values between corresponding partial rectangular areas of each frame and of a plurality of frames at preceding and following times; and
a step of extracting, as a character appearance frame, a frame that has a partial rectangular area satisfying the condition that, among the plurality of luminance histogram difference values obtained for each partial rectangular area between the frame and the plurality of frames at preceding and following times, at least one of the difference values obtained against the frames at preceding times is larger than a preset threshold, and all of the difference values obtained against the frames at following times are smaller than a preset threshold, and outputting the character appearance frame.