JP2000187732A

JP2000187732A - Method and device for deciding character area and recording medium storing the method

Info

Publication number: JP2000187732A
Application number: JP10363651A
Authority: JP
Inventors: Hidekatsu Kuwano; 秀豪桑野; Hiroyuki Arai; 啓之新井; Masaharu Kurakake; 正治倉掛; Toshiaki Sugimura; 利明杉村
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1998-12-22
Filing date: 1998-12-22
Publication date: 2000-07-04
Anticipated expiration: 2018-12-22
Also published as: JP3504874B2

Abstract

PROBLEM TO BE SOLVED: To improve the accuracy of discriminating character areas in an image that includes characters. SOLUTION: A color image input storage part 1 inputs color image data including characters and stores it in a memory. An image area division part 2 divides the stored color image into connected pixel areas by a predetermined method. A high contrast area deciding part 3 detects edges at the boundaries between the divided areas, keeps each area whose ratio between peripheral length and edge pixels exceeds a threshold as a high contrast area likely to contain a character, and discards the other areas as low contrast areas. The character area image obtained in this way is stored in a character area image storing part 4.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical field of the invention] The present invention relates to a character region determination technique that, among the frame images constituting a color moving image such as television broadcast video, takes a frame image in which characters such as telop (caption) characters are displayed and extracts the character portions from it as connected pixel regions.

[0002]

[Prior art] Much research has been conducted on character region determination techniques that extract character portions as connected pixel regions, either from a frame image displaying characters among the frame images constituting a color moving image, or from a still color image displaying characters.

[0003] The method proposed in reference [1] (Kuwano, Kurakake, Odaka: "A telop character extraction method for video data retrieval", IEICE Technical Report, PRMU96-98, pp. 39-46 (1996-11)), hereinafter called conventional method [1], first divides the input color image into connected pixel regions by a color space division process and then decides, for each resulting region, whether it is a character region. As shown on the left of FIG. 6, conventional method [1] determines as character regions those regions whose area is at or below a fixed value, that do not touch the image frame, and whose position does not change for a fixed time.
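The decision rule of conventional method [1] can be sketched as follows. This is a minimal illustration, not the implementation from reference [1]: the `Region` fields, the function name, and the threshold values `max_area` and `min_static_frames` are assumptions chosen for the example, and the three criteria are assumed to be combined with a logical AND.

```python
from dataclasses import dataclass


@dataclass
class Region:
    area: int             # number of pixels in the connected region
    touches_frame: bool   # True if any pixel lies on the image border
    frames_static: int    # consecutive frames with an unchanged position


def is_character_conventional(region, max_area=500, min_static_frames=30):
    """Hypothetical rule: small, off-border, static regions count as characters."""
    return (region.area <= max_area
            and not region.touches_frame
            and region.frames_static >= min_static_frames)
```

A small, static, low-contrast object that does not touch the frame passes all three tests, which is the kind of false positive this rule cannot reject.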

[0004]

[Problem to be solved by the invention] However, as shown on the right of FIG. 6, conventional method [1] has the problem that the character region determination performed after region division leaves behind non-character regions that have roughly the same area as characters, do not touch the image frame, and do not change position for a fixed time.

[0005] An object of the present invention is therefore to provide a character region determination method and apparatus that improve the accuracy of discriminating character regions in an image containing characters, for example in video, by deleting non-character regions that have roughly the same area as characters, do not touch the image frame, and do not change position for a fixed time.

[0006]

[Means for solving the problem] To solve the above problem, the character region extraction method according to the present invention comprises: a first step of inputting an image in which characters are displayed and storing it as an original image; a second step of dividing the original image input and stored in the first step into connected pixel regions by a predetermined method to obtain a region-divided image; a third step of calculating, for each region in the region-divided image obtained in the second step, a contrast feature representing the difference in brightness between the inside and the outside of the region at the region boundary, leaving regions whose contrast feature is larger than a preset value in the region-divided image as character regions, and deleting the other regions from the region-divided image; and a fourth step of storing a character region image containing all the connected pixel regions determined to be character regions in the third step.

[0007] Alternatively, in the above character region extraction method, the third step comprises: a step 3-1 of detecting edge pixels in the input original image by a predetermined method (for example, using Robinson's edge detection operator); a step 3-2 of calculating the number of inner boundary pixels of each region in the region-divided image; a step 3-3 of calculating, among the inner boundary pixels of each region in the region-divided image, the number of pixels that are edges detected in step 3-1; a step 3-4 of calculating the ratio between the number of inner boundary pixels of each region obtained in step 3-2 and the number of those inner boundary pixels that are edges obtained in step 3-3; and a step 3-5 of, when the ratio obtained in step 3-4 is larger than a preset value, determining the region to be a character region and leaving it in the region-divided image, and otherwise determining the region to be a background noise region and deleting it from the region-divided image.

[0008] Furthermore, a program for causing a computer to execute the steps of the above character region determination method is recorded on a recording medium readable by the computer.

[0009] Likewise, to solve the above problem, the character region extraction apparatus according to the present invention comprises: image input storage means for inputting an image in which characters are displayed and storing it as an original image; image region division means for dividing the original image input and stored by the image input storage means into connected pixel regions by a predetermined method to obtain a region-divided image; high-contrast region determination means for calculating, for each region in the region-divided image obtained by the image region division means, a contrast feature representing the difference in brightness between the inside and the outside of the region at the region boundary, leaving regions whose contrast feature is larger than a preset value in the region-divided image as character regions, and deleting the other regions from the region-divided image; character region image storage means for storing a character region image containing all the connected pixel regions determined to be character regions by the high-contrast region determination means; and control means for controlling the execution order of the image input storage means, the image region division means, the high-contrast region determination means, and the character region image storage means.

[0010] Alternatively, in the above character region extraction apparatus, the high-contrast region determination means comprises: edge detection means for detecting edge pixels in the original image by a predetermined method (for example, using Robinson's edge detection operator); region perimeter calculation means for calculating the number of inner boundary pixels of each region in the region-divided image; region boundary edge calculation means for calculating, among the inner boundary pixels of each region in the region-divided image, the number of pixels that are edges detected by the edge detection means; region perimeter/edge ratio calculation means for calculating the ratio between the number of inner boundary pixels of each region obtained by the region perimeter calculation means and the number of those inner boundary pixels that are edges obtained by the region boundary edge calculation means; and character region determination means for, when the ratio obtained by the region perimeter/edge ratio calculation means is larger than a preset value, determining the region to be a character region and leaving it in the region-divided image, and otherwise determining the region to be a background noise region and deleting it from the region-divided image.

[0011] In general, characters displayed in video often have high luminance contrast with their surroundings. Because conventional method [1] does not evaluate the luminance contrast around a region, it also leaves behind regions of ordinary objects with low luminance contrast at their boundaries, among the non-character regions that have roughly the same area as characters, do not touch the image frame, and do not change position for a fixed time.

[0012] The present invention therefore calculates, after region division, a contrast feature of the luminance values at the boundary of each region and leaves only the high-contrast regions, which makes it possible to improve the accuracy of character region discrimination.

[0013]

[Embodiments of the invention] Embodiments of the present invention are described in detail below with reference to the drawings.

[0014] FIG. 1 is a block diagram showing the configuration of an apparatus according to an embodiment of the present invention together with the flow of processing.

[0015] In FIG. 1, reference numeral 1 denotes a color image input storage unit, which inputs image data such as a color image in which characters are displayed and stores it in a memory.

[0016] Reference numeral 2 denotes an image region division unit, which divides the original image, such as the color image input and stored by the color image input storage unit 1, into connected pixel regions by a predetermined method (for example, the method proposed in conventional method [1]).
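The division into connected pixel regions can be illustrated with a simple labeling pass. This is only a sketch under stated assumptions: the source leaves the division method open, so the example assumes the image has already been quantized to one color index per pixel and groups 4-connected pixels with the same index. The function name and the choice of 4-connectivity are hypothetical.

```python
from collections import deque


def label_connected_regions(color_index):
    """Label 4-connected regions of equal color index; returns (labels, count)."""
    h, w = len(color_index), len(color_index[0])
    labels = [[0] * w for _ in range(h)]   # 0 means "not labeled yet"
    count = 0
    for y in range(h):
        for x in range(w):
            if labels[y][x]:
                continue
            count += 1                      # start a new region at this seed
            labels[y][x] = count
            queue = deque([(y, x)])
            while queue:                    # breadth-first flood fill
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w and not labels[ny][nx]
                            and color_index[ny][nx] == color_index[cy][cx]):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count
```

Region labels start at 1 in scan order, so every pixel of the returned image belongs to exactly one region.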

[0017] Reference numeral 3 denotes a high-contrast region determination unit, which calculates, for each region in the region-divided image obtained by the image region division unit 2, a contrast feature of the luminance values at the region boundary and determines regions whose contrast feature is larger than a preset value to be character regions.

[0018] Reference numeral 4 denotes a character region image storage unit, which stores the character region image obtained by the high-contrast region determination unit.

[0019] Reference numeral 5 denotes a processing control unit, which controls the execution order of units 1 to 4 above.

[0020] FIG. 2 is a block diagram showing an example of the configuration of the high-contrast region determination unit 3 in FIG. 1 together with an example of the flow of processing.

[0021] In FIG. 2, reference numeral 6 denotes an edge detection unit, which detects edge pixels in the color original image or the like by a predetermined method (for example, using Robinson's edge detection operator).
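As an illustration of this edge detection step, the sketch below uses one common formulation of the Robinson compass operator: eight 3x3 masks obtained by rotating the outer ring of a base mask, with the maximum response taken as the edge strength. The source does not specify the mask coefficients, the threshold, or how color is handled, so the coefficients shown, the `threshold` parameter, the grayscale input, and the function names are all assumptions.

```python
def robinson_masks():
    """Build eight compass masks by rotating the outer ring of a base mask."""
    # Outer ring positions, clockwise from the top-left corner.
    ring_pos = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    # Ring values of the assumed base mask [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]].
    ring = [-1, 0, 1, 2, 1, 0, -1, -2]
    masks = []
    for shift in range(8):                  # one mask per 45-degree rotation
        mask = [[0] * 3 for _ in range(3)]
        for i, (r, c) in enumerate(ring_pos):
            mask[r][c] = ring[(i - shift) % 8]
        masks.append(mask)
    return masks


def detect_edges(gray, threshold):
    """Mark a pixel as an edge when its largest compass response >= threshold."""
    h, w = len(gray), len(gray[0])
    masks = robinson_masks()
    edges = [[False] * w for _ in range(h)]
    for y in range(1, h - 1):               # image border pixels stay non-edge
        for x in range(1, w - 1):
            best = max(
                sum(m[dy][dx] * gray[y + dy - 1][x + dx - 1]
                    for dy in range(3) for dx in range(3))
                for m in masks)
            edges[y][x] = best >= threshold
    return edges
```

Because every mask sums to zero, a uniform neighborhood gives a response of zero, so only pixels near a brightness change are marked.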

[0022] Reference numeral 7 denotes a region perimeter calculation unit, which calculates the number of inner boundary pixels of each region in the region-divided image.
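A minimal sketch of counting inner boundary pixels is shown below. The source does not define the neighborhood, so this example assumes that a pixel of a region is an inner boundary pixel when at least one of its 4-neighbors lies outside the region or outside the image. The function name and the label-image representation are hypothetical.

```python
def inner_boundary_pixels(labels, region_id):
    """Return the (y, x) pixels of a region that touch something outside it."""
    h, w = len(labels), len(labels[0])
    boundary = []
    for y in range(h):
        for x in range(w):
            if labels[y][x] != region_id:
                continue
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                outside_image = not (0 <= ny < h and 0 <= nx < w)
                if outside_image or labels[ny][nx] != region_id:
                    boundary.append((y, x))
                    break                   # count each pixel only once
    return boundary
```

The length of the returned list plays the role of the region perimeter; for a 3x3 square region it is 8, since only the center pixel is interior.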

[0023] Reference numeral 8 denotes a region boundary edge calculation unit, which calculates, among the inner boundary pixels of each region in the region-divided image, the number of edge pixels obtained by the edge detection unit 6.
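Counting the edge pixels on a region boundary then reduces to a lookup in the edge map. In this sketch the boundary is assumed to be given as a list of (y, x) coordinates and the edge map as a 2D list of booleans; both representations and the function name are assumptions for illustration.

```python
def count_edge_boundary_pixels(boundary, edges):
    """Count how many inner boundary pixels were also detected as edge pixels."""
    return sum(1 for y, x in boundary if edges[y][x])
```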

[0024] Reference numeral 9 denotes a region perimeter/edge ratio calculation unit, which calculates the ratio between the number of inner boundary pixels of each region obtained by the region perimeter calculation unit 7 and the number of edge pixels among those inner boundary pixels obtained by the region boundary edge calculation unit 8.

[0025] Reference numeral 10 denotes a character region determination unit. When the ratio between the total perimeter on the inner boundary of a region and the number of edge pixels, obtained by the region perimeter/edge ratio calculation unit 9, is larger than a preset value, it determines the region to be a character region and leaves it in the region-divided image; otherwise it deletes the region from the region-divided image as a background noise region.
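The decision itself is simple arithmetic, sketched below. The source speaks only of "the ratio" between the two counts, so this example assumes the ratio is the number of edge pixels divided by the number of inner boundary pixels (so that a high value means high contrast). The default threshold of 0.5 and the function name are likewise assumptions.

```python
def classify_region(boundary_count, edge_count, threshold=0.5):
    """Classify a region from its boundary pixel count and edge pixel count."""
    if boundary_count == 0:                 # empty region: nothing to keep
        return "background noise"
    ratio = edge_count / boundary_count     # assumed direction of the ratio
    return "character" if ratio > threshold else "background noise"
```

For example, a region with 40 inner boundary pixels of which 36 are edge pixels has a ratio of 0.9 and is kept, while one with only 8 edge pixels has a ratio of 0.2 and is deleted.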

[0026] Reference numeral 11 denotes a processing control unit, which controls the execution order of units 6 to 10 above. The processing control unit 5 of FIG. 1 may also serve as this processing control unit 11.

[0027] FIG. 3 is a flowchart for explaining an embodiment of the processing of the high-contrast region determination unit 3 in FIG. 2.

[0028] In FIG. 3, step (31) reads the color original image and the region-divided image into memory.

[0029] Step (32) detects edge pixels in the original image by a predetermined method (for example, using Robinson's edge detection operator).

[0030] Step (33) initializes to 1 the variable n that counts the regions in the region-divided image.

[0031] Step (34) compares the variable n with the total number N of regions in the region-divided image. If n is less than or equal to N, processing moves to step (35); if n is greater than N, processing ends.

[0032] Step (35) calculates, for the n-th region in the region-divided image, the number P(n) of pixels on the inner boundary of the region.

[0033] Step (36) calculates the number Q(n) of edge pixels obtained in step (32) among the inner boundary pixels of the n-th region obtained in step (35).

[0034] Step (37) calculates the ratio between P(n) and Q(n) obtained in steps (35) and (36). If the resulting value is equal to or greater than a preset value, processing moves to step (38); otherwise it moves to step (39).

[0035] Step (38) is performed when the ratio between P(n) and Q(n) is equal to or greater than the preset value: the n-th region in the region-divided image is determined to be a character region and is left in the region-divided image.

[0036] Step (39) is performed when the ratio between P(n) and Q(n) is smaller than the preset value: the n-th region in the region-divided image is determined to be a background noise region and is deleted from the region-divided image.

[0037] Step (40) increments the variable n by 1 and then returns to step (34).
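The loop of steps (33) to (40) can be sketched as a single function. This is an illustrative reading of the flowchart, not the patented implementation: it assumes regions are labeled 1 to N in a 2D label image, that 0 marks deleted pixels, that inner boundary pixels are defined by 4-neighbors, and that the ratio is Q(n)/P(n). The function and parameter names are hypothetical, and the edge map of step (32) is taken as an input.

```python
def filter_character_regions(labels, edges, num_regions, threshold):
    """Keep region n when Q(n) / P(n) >= threshold; blank the others with 0."""
    h, w = len(labels), len(labels[0])
    result = [row[:] for row in labels]     # output region-divided image
    kept = []
    n = 1                                   # step (33)
    while n <= num_regions:                 # step (34)
        p = q = 0
        for y in range(h):
            for x in range(w):
                if labels[y][x] != n:
                    continue
                neighbours = ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                on_boundary = any(
                    not (0 <= ny < h and 0 <= nx < w) or labels[ny][nx] != n
                    for ny, nx in neighbours)
                if on_boundary:
                    p += 1                  # step (35): P(n)
                    if edges[y][x]:
                        q += 1              # step (36): Q(n)
        if p > 0 and q / p >= threshold:    # step (37)
            kept.append(n)                  # step (38): keep as character
        else:
            for y in range(h):              # step (39): delete as noise
                for x in range(w):
                    if labels[y][x] == n:
                        result[y][x] = 0
        n += 1                              # step (40)
    return kept, result
```

The input label image is left unmodified so that deleting one region does not affect the boundary test of the regions processed after it.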

[0038] FIG. 4 shows the effect of the present invention. The main feature of the invention is that, in extracting character regions from an original image such as video containing characters, the original image is divided into regions, edges are detected at the region boundaries (FIG. 4(a)), regions whose ratio between perimeter and edge pixels is at or above a threshold are kept as high-contrast regions likely to contain characters, and the remaining regions are discarded as low-contrast regions. This deletes non-character regions (for example, low-contrast still image regions) and leaves only the character regions with high accuracy (FIG. 4(b)), which in turn makes it possible to improve character recognition accuracy. This characteristic configuration corresponds to the processing in units 6 to 10 of FIG. 2, and the details of that processing are steps (35) to (39) of FIG. 3.

[0039] The above embodiment was described for the case where the original image is a color image, but the present invention is applicable both (1) when the input is a binarized image and (2) when it is an ordinary image that has not been binarized.

[0040] The present invention is most effective in a configuration where, as shown in FIG. 5, a "character region extraction process" is performed as preprocessing, followed by the "character region determination process" of the present invention, and then by a character recognition process.

[0041] It goes without saying that some or all of the means shown in FIGS. 1 and 2 can be made to function using a computer, and that the processing steps shown in FIGS. 1, 2, and 3 can be executed by a computer. A program for causing a computer to function as those means, or for causing a computer to execute those processing steps, can be recorded on a computer-readable recording medium such as an FD (floppy disk), MO, ROM, memory card, CD, DVD, or removable disk, and provided and distributed in that form.

[0042]

[Effects of the invention] As described above, the present invention calculates, after region division, a contrast feature of the luminance values at the boundary of each region and leaves only the high-contrast regions, which improves the accuracy of character region discrimination.

[Brief description of the drawings]

FIG. 1 is a block diagram showing the configuration of a character region determination apparatus according to an embodiment of the present invention together with the flow of processing.

FIG. 2 is a block diagram showing an example of the configuration of the high-contrast region determination unit in the above embodiment together with an example of the flow of processing.

FIG. 3 is a flowchart showing an example of the processing performed by the high-contrast region determination unit 3 in FIG. 1.

FIGS. 4(a) and 4(b) show an example of a character region determination result according to the embodiment of the present invention.

FIG. 5 illustrates an application example in which the effect of the present invention is exhibited best.

FIG. 6 shows an example of a character region discrimination result obtained by conventional method [1].

[Explanation of symbols]

1: Color image input storage unit
2: Image region division unit
3: High-contrast region determination unit
4: Character region image storage unit
5: Processing control unit
6: Edge detection unit
7: Region perimeter calculation unit
8: Region boundary edge calculation unit
9: Region perimeter/edge ratio calculation unit
10: Character region determination unit
11: Processing control unit

Continued from the front page

(72) Inventor: Masaharu Kurakake, c/o Nippon Telegraph and Telephone Corporation, 3-19-2 Nishi-Shinjuku, Shinjuku-ku, Tokyo

(72) Inventor: Toshiaki Sugimura, c/o Nippon Telegraph and Telephone Corporation, 3-19-2 Nishi-Shinjuku, Shinjuku-ku, Tokyo

F-terms (reference): 5B029 CC21 CC27; 5C023 AA01 AA06 AA34 BA01 BA02 BA03 CA01 CA05 DA04 DA08; 5C066 AA11 BA01 CA05 EC12 GA04 GB01 HA01 JA01 KE16; 5L096 BA17 FA06 FA44 FA54 FA65 GA28; 9A001 BB06 HH22 HH23 HH28 HH30 HH31

Claims

[Claims]

1. A character region determination method comprising: a first step of inputting an image in which characters are displayed and storing it as an original image; a second step of dividing the original image input and stored in the first step into connected pixel regions by a predetermined method to obtain a region-divided image; a third step of calculating, for each region in the region-divided image obtained in the second step, a contrast feature representing the difference in brightness between the inside and the outside of the region at the region boundary, leaving regions whose contrast feature is larger than a preset value in the region-divided image as character regions, and deleting the other regions from the region-divided image; and a fourth step of storing a character region image containing all the connected pixel regions determined to be character regions in the third step.

2. The character region extraction method according to claim 1, wherein the third step comprises: a step 3-1 of detecting edge pixels in the input original image by a predetermined method; a step 3-2 of calculating the number of inner boundary pixels of each region in the region-divided image; a step 3-3 of calculating, among the inner boundary pixels of each region in the region-divided image, the number of pixels that are edges detected in step 3-1; a step 3-4 of calculating the ratio between the number of inner boundary pixels of each region obtained in step 3-2 and the number of those inner boundary pixels that are edges obtained in step 3-3; and a step 3-5 of, when the ratio obtained in step 3-4 is larger than a preset value, determining the region to be a character region and leaving it in the region-divided image, and otherwise determining the region to be a background noise region and deleting it from the region-divided image.

3. A character region determination apparatus comprising: image input storage means for inputting an image in which characters are displayed and storing it as an original image; image region division means for dividing the original image input and stored by the image input storage means into connected pixel regions by a predetermined method to obtain a region-divided image; high-contrast region determination means for calculating, for each region in the region-divided image obtained by the image region division means, a contrast feature representing the difference in brightness between the inside and the outside of the region at the region boundary, leaving regions whose contrast feature is larger than a preset value in the region-divided image as character regions, and deleting the other regions from the region-divided image; character region image storage means for storing a character region image containing all the connected pixel regions determined to be character regions by the high-contrast region determination means; and control means for controlling the execution order of the image input storage means, the image region division means, the high-contrast region determination means, and the character region image storage means.

4. The character region extraction apparatus according to claim 3, wherein the high-contrast region determination means comprises: edge detection means for detecting edge pixels in the original image by a predetermined method; region perimeter calculation means for calculating the number of inner boundary pixels of each region in the region-divided image; region boundary edge calculation means for calculating, among the inner boundary pixels of each region in the region-divided image, the number of pixels that are edges detected by the edge detection means; region perimeter/edge ratio calculation means for calculating the ratio between the number of inner boundary pixels of each region obtained by the region perimeter calculation means and the number of those inner boundary pixels that are edges obtained by the region boundary edge calculation means; and character region determination means for, when the ratio obtained by the region perimeter/edge ratio calculation means is larger than a preset value, determining the region to be a character region and leaving it in the region-divided image, and otherwise determining the region to be a background noise region and deleting it from the region-divided image.

5. A recording medium storing a character region determination method, wherein a program for causing a computer to execute the steps of the character region determination method according to claim 1 or 2 is recorded on a recording medium readable by the computer.