JP2008011484A

JP2008011484A - Apparatus and method for extracting character and graphic string, program for executing the method, recording medium with the program stored therein

Info

Publication number: JP2008011484A
Application number: JP2006246338A
Authority: JP
Inventors: Tomoaki Ro; 朝陽盧; Shingo Ando; 慎吾安藤; Kaori Kataoka; 香織片岡; Hiroko Takahashi; 裕子高橋; Akira Suzuki; 章鈴木; Takayuki Yasuno; 貴之安野
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2006-06-02
Filing date: 2006-09-12
Publication date: 2008-01-17

Abstract

PROBLEM TO BE SOLVED: To extract a region that is a character and graphic string from an image appropriately and at high speed.

SOLUTION: A character and graphic string region is output by applying the following, in order, to image data input by an image data input means: a means 12 for applying edge extraction processing and binarization processing, an isolated point removing means 13 for removing isolated points, a straight line removing section 141, a brush processing means 14, a morphology processing means 15, a shape analysis means 16 for analyzing shape based on thresholds acquired from a threshold storage section 19, and an overlapping region removing means 17.

COPYRIGHT: (C)2008,JPO&INPIT

Description

The present invention relates to a technique for extracting patterns such as character graphic strings and graphics from image data.

In recent years, digital cameras have come to be built into mobile phones, PCs (personal computers), and the like, so image data can be easily acquired and stored at any time. As a way to make further use of this capability, it is conceivable to recognize, from the image data, a string of character graphics (that is, graphics forming characters) appearing in an image and to use it as an index for image retrieval.

For example, as a means of efficiently detecting character graphic strings, a method is known in which components unrelated to characters are removed from an image step by step and the components remaining at the end are detected as character components (see, for example, Non-Patent Document 1). This method proposes an algorithm that detects edge components, removes long straight-line components and isolated points, takes the circumscribed rectangles of the remaining components, and detects character graphic strings from the results of a shape analysis. The method is mainly intended for drawings, documents, and the like.

As a related technique, a method of automatically selecting a threshold value and performing binarization is also widely known.
Zhaoyang Lu, "Detection of Text Regions from Digital Engineering Drawings", IEEE Transactions on Pattern Analysis and Machine Intelligence, April 1998, Vol. 20, No. 4, pp. 431-439.
Nobuyuki Otsu, "Automatic threshold selection method based on discrimination and least square criterion", IEICE Transactions, 1980, Vol. J63-D, No. 4, pp. 349-356.

Because the above method targets drawings and the like, it is not optimal for extracting character graphic strings from natural images photographed with a digital camera (for example, an image with a lettered signboard in the background, or video containing telops, that is, superimposed captions). In particular, when indexing a large number of images or handling video, character graphic strings must be extracted as fast as possible, and the above method is somewhat too complicated, which causes a problem in terms of processing speed.

The present invention has been made in view of the above problem, and its object is to provide a character graphic string extraction apparatus, a character graphic string extraction method, a program for executing the method, and a recording medium storing the program, which can appropriately and quickly extract a region that is a character graphic string from a natural image or video.

To solve the above problem, the invention according to claim 1 is a character graphic string extraction apparatus that extracts regions forming character graphic strings included in image data, comprising: image data input means for inputting the image data from image data acquisition means; edge extraction means for performing edge extraction processing on the input image data and then binarizing the edge-extracted image data to generate first-stage image data; isolated point removal means for removing, from the first-stage image data, first-value pixels isolated from other first-value pixels to generate second-stage image data; straight line removal means for detecting, in the second-stage image data, regions of linearly continuous first-value pixels and removing the first-value pixels constituting those regions to generate third-stage image data; brush processing means for reading, from a threshold storage unit, a threshold that defines the neighborhood of a pixel of interest and applying brush processing to the third-stage image data based on that threshold to generate fourth-stage image data; morphology processing means for applying erosion and dilation to the fourth-stage image data to generate fifth-stage image data; shape analysis means for labeling the first-value pixel regions of the fifth-stage image data, treating each labeled region as one region, calculating a first circumscribed rectangle for each region, regarding as a second circumscribed rectangle any first circumscribed rectangle judged true in any of a first determination that is true when the number of first-value pixels in the first circumscribed rectangle exceeds an upper limit on the number of first-value pixels, a second determination that is true when the number of first-value pixels in the first circumscribed rectangle falls below a lower limit on the number of first-value pixels, a third determination that is true when the short-side length of the first circumscribed rectangle falls below a lower limit on the short-side length, a fourth determination that is true when the ratio of the number of first-value pixels to the number of second-value pixels in the first circumscribed rectangle falls below a lower limit on the ratio, and a fifth determination that is true when the contrast of the portion of the input image data corresponding to the interior of the first circumscribed rectangle falls below a lower limit on the contrast, removing the second circumscribed rectangles from the calculated first circumscribed rectangles, and regarding the remaining first circumscribed rectangles as third circumscribed rectangles; overlapping region removal means for removing, among third circumscribed rectangles in an inclusion relationship, the contained third circumscribed rectangle and regarding the remaining third circumscribed rectangles as fourth circumscribed rectangles; and character graphic string region output means for outputting the fourth circumscribed rectangles as character graphic string regions.

The invention according to claim 2 is a character graphic string extraction method for extracting regions forming character graphic strings included in image data, comprising: an image data input step of inputting the image data from image data acquisition means; an edge extraction step of performing edge extraction processing on the input image data and then binarizing the edge-extracted image data to generate first-stage image data; an isolated point removal step of removing, from the first-stage image data, first-value pixels isolated from other first-value pixels to generate second-stage image data; a straight line removal step of detecting, in the second-stage image data, regions of linearly continuous first-value pixels and removing the first-value pixels constituting those regions to generate third-stage image data; a brush processing step of reading, from a threshold storage unit, a threshold that defines the neighborhood of a pixel of interest and applying brush processing to the third-stage image data based on that threshold to generate fourth-stage image data; a morphology processing step of applying erosion and dilation to the fourth-stage image data to generate fifth-stage image data; a shape analysis step of labeling the first-value pixel regions of the fifth-stage image data, treating each labeled region as one region, calculating a first circumscribed rectangle for each region, regarding as a second circumscribed rectangle any first circumscribed rectangle judged true in any of a first determination that is true when the number of first-value pixels in the first circumscribed rectangle exceeds an upper limit on the number of first-value pixels, a second determination that is true when the number of first-value pixels in the first circumscribed rectangle falls below a lower limit on the number of first-value pixels, a third determination that is true when the short-side length of the first circumscribed rectangle falls below a lower limit on the short-side length, a fourth determination that is true when the ratio of the number of first-value pixels to the number of second-value pixels in the first circumscribed rectangle falls below a lower limit on the ratio, and a fifth determination that is true when the contrast of the portion of the input image data corresponding to the interior of the first circumscribed rectangle falls below a lower limit on the contrast, removing the second circumscribed rectangles from the calculated first circumscribed rectangles, and regarding the remaining first circumscribed rectangles as third circumscribed rectangles; an overlapping region removal step of removing, among third circumscribed rectangles in an inclusion relationship, the contained third circumscribed rectangle and regarding the remaining third circumscribed rectangles as fourth circumscribed rectangles; and a character graphic string region output step of outputting the fourth circumscribed rectangles as character graphic string regions.

The invention according to claim 3 is a character graphic string extraction program in which the character graphic string extraction method according to claim 2 is written as a computer program executable by a computer.

The invention according to claim 4 is a recording medium on which is recorded a computer-executable program describing the character graphic string extraction method according to claim 2.

The invention according to claim 5 is a character graphic string extraction apparatus that extracts regions forming character graphic strings included in image data, comprising: image data input means for inputting the image data from image data acquisition means; edge extraction means for performing edge extraction processing on the input image data and then binarizing the edge-extracted image data to generate first-stage image data; isolated point removal means for removing, from the first-stage image data, first-value pixels isolated from other first-value pixels to generate second-stage image data; straight line removal means for detecting, in the second-stage image data, regions of linearly continuous first-value pixels and removing the first-value pixels constituting those regions to generate third-stage image data; brush processing means for reading, from a threshold storage unit, a threshold that defines the neighborhood of a pixel of interest and applying brush processing to the third-stage image data based on that threshold to generate fourth-stage image data; morphology processing means for applying erosion and dilation to the fourth-stage image data to generate fifth-stage image data; shape analysis means for labeling the first-value pixel regions of the fifth-stage image data, treating each labeled region as one region, calculating a first circumscribed rectangle for each region, regarding as a second circumscribed rectangle any first circumscribed rectangle judged true in any of a first determination that is true when the number of first-value pixels in the first circumscribed rectangle exceeds an upper limit on the number of first-value pixels, a second determination that is true when the number of first-value pixels in the first circumscribed rectangle falls below a lower limit on the number of first-value pixels, a third determination that is true when the short-side length of the first circumscribed rectangle falls below a lower limit on the short-side length, a fourth determination that is true when the ratio of the number of first-value pixels to the number of second-value pixels in the first circumscribed rectangle falls below a lower limit on the ratio, and a fifth determination that is true when the contrast of the portion of the input image data corresponding to the interior of the first circumscribed rectangle falls below a lower limit on the contrast, removing the second circumscribed rectangles from the calculated first circumscribed rectangles, and regarding the remaining first circumscribed rectangles as third circumscribed rectangles; overlapping region removal means for removing, among third circumscribed rectangles in an inclusion relationship, the contained third circumscribed rectangle and regarding the remaining third circumscribed rectangles as fourth circumscribed rectangles; character-likeness analysis means for quantifying, for the fourth circumscribed rectangles, the ease of binarizing the circumscribed rectangle region, the smallness of the stroke-width variance, the maximum stroke width, the complexity of the boundary between the first-value pixel region and the second-value pixel region after binarization, and the maximum stroke length, removing from the fourth circumscribed rectangles those judged to be circumscribed rectangle regions that are not character-like, and regarding the remaining fourth circumscribed rectangles as fifth circumscribed rectangles; and character graphic string region output means for outputting the fifth circumscribed rectangles as character graphic string regions.

The invention according to claim 6 is a character graphic string extraction method for extracting regions forming character graphic strings included in image data, comprising: an image data input step of inputting the image data from image data acquisition means; an edge extraction step of performing edge extraction processing on the input image data and then binarizing the edge-extracted image data to generate first-stage image data; an isolated point removal step of removing, from the first-stage image data, first-value pixels isolated from other first-value pixels to generate second-stage image data; a straight line removal step of detecting, in the second-stage image data, regions of linearly continuous first-value pixels and removing the first-value pixels constituting those regions to generate third-stage image data; a brush processing step of reading, from a threshold storage unit, a threshold that defines the neighborhood of a pixel of interest and applying brush processing to the third-stage image data based on that threshold to generate fourth-stage image data; a morphology processing step of applying erosion and dilation to the fourth-stage image data to generate fifth-stage image data; a shape analysis step of labeling the first-value pixel regions of the fifth-stage image data, treating each labeled region as one region, calculating a first circumscribed rectangle for each region, regarding as a second circumscribed rectangle any first circumscribed rectangle judged true in any of a first determination that is true when the number of first-value pixels in the first circumscribed rectangle exceeds an upper limit on the number of first-value pixels, a second determination that is true when the number of first-value pixels in the first circumscribed rectangle falls below a lower limit on the number of first-value pixels, a third determination that is true when the short-side length of the first circumscribed rectangle falls below a lower limit on the short-side length, a fourth determination that is true when the ratio of the number of first-value pixels to the number of second-value pixels in the first circumscribed rectangle falls below a lower limit on the ratio, and a fifth determination that is true when the contrast of the portion of the input image data corresponding to the interior of the first circumscribed rectangle falls below a lower limit on the contrast, removing the second circumscribed rectangles from the calculated first circumscribed rectangles, and regarding the remaining first circumscribed rectangles as third circumscribed rectangles; an overlapping region removal step of removing, among third circumscribed rectangles in an inclusion relationship, the contained third circumscribed rectangle and regarding the remaining third circumscribed rectangles as fourth circumscribed rectangles; a character-likeness analysis step of quantifying, for the fourth circumscribed rectangles, the ease of binarizing the circumscribed rectangle region, the smallness of the stroke-width variance, the maximum stroke width, the complexity of the boundary between the first-value pixel region and the second-value pixel region after binarization, and the maximum stroke length, removing from the fourth circumscribed rectangles those judged to be circumscribed rectangle regions that are not character-like, and regarding the remaining fourth circumscribed rectangles as fifth circumscribed rectangles; and a character graphic string region output step of outputting the fifth circumscribed rectangles as character graphic string regions.

The invention according to claim 7 is a character graphic string extraction program in which the character graphic string extraction method according to claim 6 is written as a computer program executable by a computer.

The invention according to claim 8 is a recording medium on which is recorded a computer-executable program describing the character graphic string extraction method according to claim 6.

According to the inventions of claims 1, 2, 3, and 4, simplified shape analysis processing can be executed on an image that has undergone the above-described edge extraction, binarization, isolated point removal, brush processing, and morphology processing.

According to the inventions of claims 5, 6, 7, and 8, simplified shape analysis processing and character-likeness analysis processing can be executed on an image that has undergone the above-described edge extraction, binarization, isolated point removal, brush processing, and morphology processing.

As described above, according to the inventions of claims 1, 2, 3, and 4, the shape analysis processing is simplified, so the character graphic string extraction processing as a whole can extract character graphic string regions from an image appropriately and at high speed. In addition, erroneous extraction of regions that are not character graphic strings can be suppressed, which reduces the load on subsequent recognition processing.

According to the inventions of claims 5, 6, 7, and 8, erroneous extraction of non-character-string regions can be suppressed further than with the inventions of claims 1, 2, 3, and 4, which reduces the load on subsequent recognition processing.

These inventions can thereby contribute to the field of character recognition technology.

Embodiments of the present invention are described in detail below with reference to the drawings.

The configuration of the character graphic string extraction apparatus of this embodiment is described with reference to FIG. 1. The apparatus comprises an image data input unit 11, an image processing unit 1, a character graphic string region output unit 18, and a threshold storage unit 19. The image processing unit 1 comprises an edge extraction unit 12, an isolated point removal unit 13, a straight line removal unit 141, a brush processing unit 14, a morphology processing unit 15, a shape analysis unit 16, and an overlapping region removal unit 17. A means for inputting thresholds into the threshold storage unit 19 and storing them there may be provided, or fixed values may be stored in the threshold storage unit 19 in advance. The thresholds are, for example, thresholds for recognizing patterns that include character graphics (or character graphic strings) and graphics (for example, trademarks).

The image data input unit 11 receives natural image data (that is, multi-valued image data) from an image data acquisition means (for example, a digital camera or a database storing image data) and transmits it to the edge extraction unit 12. When natural image data is taken from video, each frame image (that is, image data) of the video is input through the image data input unit 11.

The edge extraction unit 12 performs edge extraction processing on the transmitted image data, binarizes the edge-extracted image data, and transmits the resulting image data to the isolated point removal unit 13. The edge extraction processing uses, for example, a Sobel operator or a Laplacian operator. The binarization uses, for example, a fixed threshold or an adaptive threshold based on the brightness histogram within a local window. In the following description, the pixels obtained by binarizing the edge regions are treated as black pixels (that is, pixels with pixel value "0"; first-value pixels).
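The edge extraction and binarization stage can be illustrated with a minimal sketch. It assumes a Sobel operator followed by a fixed threshold, which are two of the options the text names; the function name `sobel_binarize`, the use of |gx| + |gy| as the gradient magnitude, the value 1 for white pixels, and leaving border pixels white are all assumptions made for illustration, not details from the source.

```python
BLACK, WHITE = 0, 1  # BLACK = 0 follows the text; WHITE = 1 is an assumption


def sobel_binarize(gray, threshold):
    """Return a binary image that is BLACK where the Sobel gradient
    magnitude exceeds `threshold` and WHITE elsewhere.

    `gray` is a list of rows of integer brightness values.
    Border pixels are left WHITE (an assumption of this sketch).
    """
    h, w = len(gray), len(gray[0])
    out = [[WHITE] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal Sobel response: right column minus left column.
            gx = (gray[y - 1][x + 1] + 2 * gray[y][x + 1] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y][x - 1] - gray[y + 1][x - 1])
            # Vertical Sobel response: bottom row minus top row.
            gy = (gray[y + 1][x - 1] + 2 * gray[y + 1][x] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y - 1][x] - gray[y - 1][x + 1])
            # |gx| + |gy| is a common cheap approximation of the magnitude.
            if abs(gx) + abs(gy) > threshold:
                out[y][x] = BLACK
    return out
```

An adaptive threshold computed per local window, as the text also mentions, would replace the single `threshold` argument with a value derived from each window's brightness histogram.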

The isolated point removal unit 13 deletes isolated single-pixel black pixels from the edge-extracted image data and transmits the resulting image data to the straight line removal unit 141.
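A minimal sketch of isolated point removal follows. The text does not say which neighborhood defines "isolated", so the 8-neighborhood used here is an assumption, as are the function name `remove_isolated_points` and the value 1 for white pixels.

```python
BLACK, WHITE = 0, 1  # BLACK = 0 follows the text; WHITE = 1 is an assumption


def remove_isolated_points(img):
    """Turn WHITE every BLACK pixel that has no BLACK pixel among its
    8 neighbours (the 8-neighbourhood is an assumption of this sketch)."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(h):
        for x in range(w):
            if img[y][x] != BLACK:
                continue
            # Look at the surrounding 3x3 window, clipped to the image.
            has_black_neighbour = any(
                img[ny][nx] == BLACK
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2))
                if (ny, nx) != (y, x)
            )
            if not has_black_neighbour:
                out[y][x] = WHITE
    return out
```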

The straight line removal unit 141 detects, in the image data after isolated black pixel removal, places where black pixels continue in a straight line, and removes all of the black pixels constituting them. Specifically, the image is first rotated (or skew-transformed) by a rotation angle θ° in the world coordinate system. The rotated image is assumed to remain a binary image. Next, the image is scanned (inspected line by line) in the vertical direction (y-axis of the world coordinate system) or the horizontal direction (x-axis of the world coordinate system), the run length L of black pixels is measured, and any run whose length L is greater than a specific threshold n₁ is regarded as a straight line. The threshold n₁ is read from the threshold storage unit 19. The same processing is then repeated with the rotation angle θ° changed over several steps. All pixels constituting the runs regarded as straight lines are then removed. Finally, the image data from which straight lines have been removed is generated and transmitted to the brush processing unit 14.
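The run-length test at the core of this step can be sketched as follows. This is a simplified illustration that handles only θ = 0 and scans both rows and columns; the rotation (or skew) over several angles described in the text is omitted. The names `remove_long_runs` and `_long_runs` and the value 1 for white pixels are assumptions.

```python
BLACK, WHITE = 0, 1  # BLACK = 0 follows the text; WHITE = 1 is an assumption


def _long_runs(coords, img, n1):
    """Yield the coordinates of every BLACK run longer than n1 along one
    scan line, given the (y, x) coordinates of that line in order."""
    run = []
    for y, x in coords:
        if img[y][x] == BLACK:
            run.append((y, x))
        else:
            if len(run) > n1:
                yield from run
            run = []
    if len(run) > n1:  # a run that reaches the end of the scan line
        yield from run


def remove_long_runs(img, n1):
    """Remove BLACK runs whose length L satisfies L > n1, scanning
    rows and columns of the unrotated image only."""
    h, w = len(img), len(img[0])
    marked = set()
    for y in range(h):  # horizontal scan lines
        marked.update(_long_runs([(y, x) for x in range(w)], img, n1))
    for x in range(w):  # vertical scan lines
        marked.update(_long_runs([(y, x) for y in range(h)], img, n1))
    out = [row[:] for row in img]
    for y, x in marked:
        out[y][x] = WHITE
    return out
```

Detection runs on the unmodified input and removal happens afterwards, so a pixel shared by a horizontal and a vertical line is found by both scans.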

The brush processing unit 14 extends every black pixel transmitted from the straight line removal unit 141 up, down, left, and right (in the y and x directions of the image coordinate system) by up to a certain distance (number of pixels). That is, it performs an operation based on the rules below on the entire image data transmitted from the straight line removal unit 141. Brush processing here means processing that extends all target pixels radially up to a certain distance (number of pixels). The image data resulting from this operation is generated and transmitted to the morphology processing unit 15.

First, if any pixel within the range of n₂ pixels above (+y direction of the image coordinate system) and n₂ pixels below (−y direction of the image coordinate system) the pixel of interest is a black pixel, the pixel of interest is also made a black pixel. Here n₂ is a threshold that defines the neighborhood of the pixel of interest in the brush processing and takes a positive value. The threshold n₂ is read from the threshold storage unit 19.

その後、注目画素から左（画像座標系の−ｘ方向）ｎ₂画素数分及び右（画像座標系の＋ｘ方向）ｎ₂画素数分の範囲内における画素のどれかが一つでも黒色画素であれば注目画素も黒色画素にする。 Thereafter, any one of the pixels within the range corresponding to the number of n ₂ pixels on the left (−x direction of the image coordinate system) and the number of n ₂ pixels on the right (the + x direction of the image coordinate system) from the target pixel is a black pixel. If so, the pixel of interest is also black.

以上が、本実施形態におけるブラッシュ処理になる。 The above is the brushing process in the present embodiment.
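The two brush passes above can be illustrated with a short sketch. It assumes a binary image stored as nested lists (1 = black, 0 = white) and assumes that each pass reads from an unmodified copy of its input, so that newly blackened pixels do not spread further within the same pass; the text does not state this explicitly. The name `brush` is hypothetical.

```python
from typing import List

Image = List[List[int]]  # assumption: 1 = black pixel, 0 = white pixel


def brush(img: Image, n2: int) -> Image:
    """Extend black pixels by n2 pixels vertically, then horizontally."""
    h = len(img)
    w = len(img[0]) if h else 0

    # Pass 1: a pixel becomes black if any pixel within n2 rows above or
    # below it (same column) is black in the input image.
    vert = [[0] * w for _ in range(h)]
    for y in range(h):
        lo, hi = max(0, y - n2), min(h - 1, y + n2)
        for x in range(w):
            if any(img[k][x] == 1 for k in range(lo, hi + 1)):
                vert[y][x] = 1

    # Pass 2: the same rule along the row, applied to the result of pass 1.
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            lo, hi = max(0, x - n2), min(w - 1, x + n2)
            if any(vert[y][k] == 1 for k in range(lo, hi + 1)):
                out[y][x] = 1
    return out
```

Applied in this order, a single black pixel grows into a filled square of side 2·n₂ + 1, which is what merges neighbouring character strokes into one connected region.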

The morphology processing unit 15 applies erosion and dilation (that is, morphology processing) to the image data transmitted from the brush processing unit 14, generates the resulting image data, and transmits it to the shape analysis unit 16.

The shape analysis unit 16 performs the following processing as shape analysis.

First, the black pixels in the image data transmitted from the morphology processing unit 15 are labeled, each set of pixels with the same label is treated as one region, and the circumscribed rectangular region (hereinafter simply called the circumscribed rectangle) of each region is calculated (extracted). The circumscribed rectangle may also be an obliquely rotated rectangle.

Then, circumscribed rectangles that satisfy any one of the following formulas 1 to 5 are removed, and circumscribed rectangles that satisfy none of them are extracted. Here n₃ is a positive threshold that defines the upper limit of the number of black pixels in the circumscribed rectangle. n₄ is a positive threshold that defines the lower limit of the number of black pixels in the circumscribed rectangle. n₅ is a positive threshold that defines the lower limit of the short side of the circumscribed rectangle. n₆ is a positive threshold that defines the lower limit of the ratio of the number of black pixels to the number of white pixels (second value pixels) in the circumscribed rectangle. n₇ is a positive threshold that defines the lower limit of the contrast of the original image (that is, the natural image data) within the circumscribed rectangle. The contrast is measured on the original image before binarization within the rectangle and is defined by, for example, the variance of the brightness histogram. The thresholds n₃ to n₇ are read from the threshold storage unit 19.

Here, P_b is the number of black pixels in the rectangle, L_s is the short side of the rectangle, P_w is the number of white pixels in the rectangle, and P_c is the contrast of the original image within the rectangle.
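Formulas 1 to 5 are not reproduced in this text. The sketch below shows one reading of them, inferred only from the threshold definitions above (n₃ as an upper limit, n₄ to n₇ as lower limits) and the symbols P_b, L_s, P_w, and P_c. The class names, field names, and the handling of a rectangle with no white pixels are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class RectStats:
    black: int         # P_b: number of black pixels in the rectangle
    white: int         # P_w: number of white pixels in the rectangle
    short_side: float  # L_s: length of the short side
    contrast: float    # P_c: contrast of the original image in the rectangle


@dataclass
class Thresholds:
    n3: float  # upper limit of the black pixel count
    n4: float  # lower limit of the black pixel count
    n5: float  # lower limit of the short side
    n6: float  # lower limit of the black/white pixel ratio
    n7: float  # lower limit of the contrast


def is_rejected(r: RectStats, t: Thresholds) -> bool:
    """Return True if the rectangle matches any of the five removal tests."""
    # Assumption: a rectangle with no white pixels has an unbounded ratio
    # and therefore never fails the ratio test.
    ratio = r.black / r.white if r.white > 0 else float("inf")
    return (
        r.black > t.n3            # too many black pixels
        or r.black < t.n4         # too few black pixels
        or r.short_side < t.n5    # too thin
        or ratio < t.n6           # black/white ratio too small
        or r.contrast < t.n7      # contrast too low
    )
```

Rectangles for which `is_rejected` returns False correspond to those passed on to the overlapping region removal unit.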

The extracted circumscribed rectangles are transmitted to the overlapping region removal unit 17.

The overlapping region removal unit 17 examines how the extracted circumscribed rectangles overlap; where one rectangle completely contains another, it removes the smaller one (that is, the contained circumscribed rectangle), and it transmits the remaining circumscribed rectangles to the character graphic string region output unit 18.
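A minimal sketch of the containment test follows. For brevity it assumes axis-aligned rectangles given as `(x0, y0, x1, y1)` tuples, although the description also allows rotated rectangles; the function names are hypothetical. When two rectangles are identical, the sketch keeps the first one, a tie-breaking rule that the text does not specify.

```python
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # assumption: (x0, y0, x1, y1)


def contains(outer: Rect, inner: Rect) -> bool:
    """True if `inner` lies completely inside `outer` (borders may touch)."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])


def remove_contained(rects: List[Rect]) -> List[Rect]:
    """Drop every rectangle that is completely contained in another one."""
    kept = []
    for i, r in enumerate(rects):
        dropped = False
        for j, other in enumerate(rects):
            if i == j:
                continue
            # Identical rectangles contain each other; keep the earlier one.
            if contains(other, r) and (other != r or j < i):
                dropped = True
                break
        if not dropped:
            kept.append(r)
    return kept
```

Partially overlapping rectangles are left untouched, since only complete inclusion triggers removal.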

The character graphic string region output unit 18 outputs information about the remaining circumscribed rectangles (for example, the length of the long side, the length of the short side, and the rotation angle of each rectangle) as character graphic string regions. For example, the circumscribed rectangles are superimposed on the original image (that is, the image input by the image data input unit 11) and shown on a display device.

The character graphic string extraction method of the present embodiment will be described with reference to FIG. 2. In the following description, explanation of elements bearing the same reference numerals as in FIG. 1 is omitted.

First, natural image data is input from the image data acquisition means (S101).

Next, edge extraction processing is applied to the input natural image data, and binarization processing is then applied to the edge-extracted image data (S102).

Next, isolated black pixels consisting of a single pixel are deleted from the image data that has undergone edge extraction and binarization (S103).

Next, the straight line removal described above is applied to the image data from which the isolated black pixels have been deleted (S1041).

Next, the brush processing described above is applied to the entire image data from which straight lines have been removed (S104).

Next, the morphology processing described above is applied to the brush-processed image data (S105).

Next, the shape analysis described above is applied to the morphology-processed image data, and circumscribed rectangular regions are extracted (S106).

Then, the overlap of the extracted circumscribed rectangular regions is examined, and where one region completely contains another, the contained region is removed (S107). The circumscribed rectangular regions that remain are the regions that match the pattern (for example, a character graphic string pattern) defined by the thresholds stored in the threshold storage unit 19.

The present embodiment can also be realized by supplying a storage medium on which the program code of software implementing the functions of the embodiment is recorded to a system or apparatus, and having the CPU (Central Processing Unit) (or MPU (Microprocessing Unit)) of that system or apparatus read and execute the program code stored on the storage medium. In that case, the program code read from the storage medium itself realizes the functions of the embodiment described above, and the storage medium storing the program code, for example a CD-ROM (Compact Disk Read Only Memory), DVD-ROM (Digital Versatile Disk Read Only Memory), CD-R (Compact Disk Recordable), CD-RW (Compact Disk ReWritable), MO (Magneto-Optical disk), or HDD (Hard Disk Drive), constitutes the present invention.

When character graphic detection accuracy matters more than speed, an example in which a "character-likeness analysis unit" is added after the overlapping region removal unit (for example, the configuration of FIGS. 3 and 4) is also conceivable.

The image data input unit 21 through the overlapping region removal unit 27, the straight line removal unit 241, and the character graphic string region output unit 29 in FIG. 3 have the same functions as the image data input unit 11 through the overlapping region removal unit 17, the straight line removal unit 141, and the character graphic string region output unit 18, respectively.

The character-likeness analysis unit 28 calculates five kinds of character-likeness reference values for each extracted character graphic region candidate (for example, each circumscribed rectangle left by the overlapping region removal unit 27) and excludes candidates judged not to be character graphic regions. All character graphic region candidates are assumed to have been detected as rectangles and to have been size-normalized in advance so that the short side of the rectangle equals S. The value of S is determined in advance.

The first character-likeness reference value quantifies how easily the region can be binarized.

First, a brightness histogram like the one in FIG. 5 is calculated for one character graphic region candidate. In FIG. 5, the x-axis is brightness and the y-axis is frequency.

Next, a threshold for binarization is obtained from the histogram. One example of a threshold calculation method is that of Non-Patent Document 2.

Next, two peak positions (for example, the left peak position Pl and the right peak position Pr in FIG. 5) are detected from the histogram. For example, each can be calculated as the position of maximum frequency within one of the two regions separated by the threshold (for example, the threshold th in FIG. 5).

Then, for each peak position, the frequencies within a range of width w1 before and after the peak are accumulated, and the percentage of the whole that they account for is determined. The value of the width w1 is determined in advance.

The ratio calculated around the left peak position is called PerL, and the ratio calculated around the right peak position is called PerR. If PerL and PerR are both smaller than a predetermined Thre1 (PerL < Thre1 and PerR < Thre1), the candidate is regarded as not being a character graphic region and is excluded.
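The calculation of PerL and PerR can be sketched as follows. The binarization threshold `th` is taken as an input because the method of Non-Patent Document 2 is not described here; the histogram is assumed to be a list of frequencies indexed by brightness, with `0 < th < len(hist)`. The function names are hypothetical, and the sketch lets the window around a peak extend across the threshold, which the text neither requires nor forbids.

```python
from typing import List, Tuple


def peak_mass_ratios(hist: List[int], th: int, w1: int) -> Tuple[float, float]:
    """Return (PerL, PerR): share of all pixels within w1 bins of each peak."""
    total = sum(hist)
    if total == 0:
        return 0.0, 0.0

    # Peak = bin of maximum frequency on each side of the threshold.
    left_peak = max(range(0, th), key=lambda i: hist[i])
    right_peak = max(range(th, len(hist)), key=lambda i: hist[i])

    def mass(peak: int) -> float:
        lo = max(0, peak - w1)
        hi = min(len(hist), peak + w1 + 1)
        return sum(hist[lo:hi]) / total

    return mass(left_peak), mass(right_peak)


def fails_binarization_test(per_l: float, per_r: float, thre1: float) -> bool:
    """Candidate is excluded when both ratios are below Thre1."""
    return per_l < thre1 and per_r < thre1
```

A region with two sharp brightness peaks (text and background) yields large ratios and passes; a region with a flat histogram yields small ratios and is excluded.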

The second character-likeness reference value quantifies how small the variance of the stroke width is. A stroke is a line that forms part of a character.

First, for one character graphic region candidate, a binarized character graphic region candidate is generated by binarizing the inside of the region.

Next, the following processing is applied to one black pixel.

In process A1, black pixels are scanned from that pixel in the vertical direction, the horizontal direction, the lower-left to upper-right direction, and the upper-left to lower-right direction, and the run length in each of the four directions is measured.

In process A2, the smallest of those run lengths is defined as the "stroke width" and is voted into a "stroke width histogram" like the one in FIG. 6. In FIG. 6, the x-axis is stroke width and the y-axis is frequency.

Processes A1 and A2 are applied to all black pixels.

Next, the peak position P (for example, the peak position P in FIG. 6) is obtained from the stroke width histogram by searching for the position of maximum frequency.

Next, the frequencies within a range of width w2 before and after the peak position are accumulated, and the percentage of the whole that they account for (the ratio PerW1) is determined. The value of the width w2 is determined in advance. The same processing used to obtain PerW1 is applied to the binarized character graphic region candidate with black and white inverted to obtain PerW2. PerW1 and PerW2 are compared, and the larger is adopted as PerW. The correspondence established at this point between white or black pixels and the character and background components is also used below.

Finally, if the obtained PerW is smaller than a predetermined Thre2 (PerW < Thre2), the candidate is regarded as not being a character graphic region and is excluded.
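Processes A1 and A2 and the PerW calculation can be sketched as follows. The sketch assumes a binary image stored as nested lists (1 = black, 0 = white) and measures each run through the pixel of interest in both senses of a direction; the function names are hypothetical.

```python
from collections import Counter
from typing import List

Image = List[List[int]]  # assumption: 1 = black pixel, 0 = white pixel

# (dy, dx) steps: horizontal, vertical, and the two diagonals.
DIRECTIONS = ((0, 1), (1, 0), (1, 1), (1, -1))


def run_length(img: Image, y: int, x: int, dy: int, dx: int) -> int:
    """Length of the black run through (y, x) along direction (dy, dx)."""
    h, w = len(img), len(img[0])
    n = 1
    for s in (1, -1):  # walk forwards, then backwards
        yy, xx = y + s * dy, x + s * dx
        while 0 <= yy < h and 0 <= xx < w and img[yy][xx] == 1:
            n += 1
            yy, xx = yy + s * dy, xx + s * dx
    return n


def stroke_width_ratio(img: Image, w2: int) -> float:
    """Share of black pixels whose stroke width is within w2 of the peak."""
    widths: Counter = Counter()
    for y, row in enumerate(img):
        for x, v in enumerate(row):
            if v == 1:
                # Process A2: stroke width = shortest of the four runs.
                widths[min(run_length(img, y, x, dy, dx)
                           for dy, dx in DIRECTIONS)] += 1
    total = sum(widths.values())
    if total == 0:
        return 0.0
    peak = max(widths, key=widths.get)
    return sum(c for wd, c in widths.items() if abs(wd - peak) <= w2) / total


def per_w(img: Image, w2: int) -> float:
    """PerW: the larger ratio of the image and its black/white inverse."""
    inverted = [[1 - v for v in row] for row in img]
    return max(stroke_width_ratio(img, w2), stroke_width_ratio(inverted, w2))
```

Taking the larger of the two ratios means the test does not need to know in advance whether the text is darker or lighter than its background.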

The third character-likeness reference value quantifies the maximum stroke width. The stroke width described above is calculated for every black pixel (black pixels being the character component), and the maximum of those values is obtained. The ratio of that maximum to the short side length S of the rectangular region is calculated and called PerD. If PerD is larger than a predetermined Thre3 (PerD > Thre3), the candidate is regarded as not being a character graphic region and is excluded.

The fourth character-likeness reference value quantifies the complexity of the boundary between the black pixel region (the character component) and the white pixel region (the background component) after binarization.

First, the inside of the region is binarized for one character graphic region candidate.

Next, the pixels lying on the boundary between the black pixel region and the white pixel region are detected and counted. The ratio of that count to the total number of pixels is calculated and called PerF. If PerF is larger than a predetermined Thre4 (PerF > Thre4), the candidate is regarded as not being a character graphic region and is excluded.
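The boundary ratio PerF can be illustrated with the sketch below. The text does not define a boundary pixel precisely; the sketch assumes that a boundary pixel is a black pixel with at least one white 4-neighbour, and that the image is a nested list with 1 = black and 0 = white. The function name is hypothetical.

```python
from typing import List

Image = List[List[int]]  # assumption: 1 = black pixel, 0 = white pixel


def boundary_ratio(img: Image) -> float:
    """PerF: boundary pixel count divided by the total pixel count."""
    h = len(img)
    w = len(img[0]) if h else 0
    if h * w == 0:
        return 0.0
    count = 0
    for y in range(h):
        for x in range(w):
            if img[y][x] != 1:
                continue
            # Assumption: black pixel touching a white 4-neighbour = boundary.
            for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                yy, xx = y + dy, x + dx
                if 0 <= yy < h and 0 <= xx < w and img[yy][xx] == 0:
                    count += 1
                    break
    return count / (h * w)
```

Noisy textures such as foliage produce many small black/white transitions and therefore a high PerF, whereas character strokes produce comparatively few.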

The fifth character-likeness reference value quantifies the maximum stroke length.

In process B1, black pixels are scanned from one black pixel (a black pixel of the character component) in the vertical direction, the horizontal direction, the lower-left to upper-right direction, and the upper-left to lower-right direction, and the run length in each of the four directions is measured.

In process B2, the largest of those run lengths is defined as the "stroke length".

Processes B1 and B2 are applied to all black pixels, and the maximum stroke length is obtained. The ratio of that maximum to the short side length S of the rectangular region is calculated and defined as PerS. If PerS is larger than a predetermined Thre5 (PerS > Thre5), the candidate is regarded as not being a character graphic region and is excluded.
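Because the third and fifth reference values share the same four-direction run-length measurement, both can be computed in one pass. The sketch below assumes a binary image stored as nested lists (1 = black, 0 = white) in which black pixels are the character component; the function names are hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]  # assumption: 1 = black pixel, 0 = white pixel

# (dy, dx) steps: horizontal, vertical, and the two diagonals.
DIRECTIONS = ((0, 1), (1, 0), (1, 1), (1, -1))


def run_length(img: Image, y: int, x: int, dy: int, dx: int) -> int:
    """Length of the black run through (y, x) along direction (dy, dx)."""
    h, w = len(img), len(img[0])
    n = 1
    for s in (1, -1):  # walk forwards, then backwards
        yy, xx = y + s * dy, x + s * dx
        while 0 <= yy < h and 0 <= xx < w and img[yy][xx] == 1:
            n += 1
            yy, xx = yy + s * dy, xx + s * dx
    return n


def stroke_ratios(img: Image, s: float) -> Tuple[float, float]:
    """Return (PerD, PerS) for a candidate whose short side length is s."""
    max_width = 0   # largest per-pixel minimum run (stroke width)
    max_length = 0  # largest per-pixel maximum run (stroke length)
    for y, row in enumerate(img):
        for x, v in enumerate(row):
            if v != 1:
                continue
            runs = [run_length(img, y, x, dy, dx) for dy, dx in DIRECTIONS]
            max_width = max(max_width, min(runs))
            max_length = max(max_length, max(runs))
    return max_width / s, max_length / s
```

A candidate would then be excluded when the first value exceeds Thre3 or the second exceeds Thre5, following the comparisons stated in the text.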

Information about the circumscribed rectangles that remain after applying the combination of character-likeness reference values described above is output by the character graphic string region output unit 29.

The processing of this example will be described with reference to FIG. 4. The image data input step S201 through the overlapping region removal step S207 and the straight line removal step S2041 in FIG. 4 perform the same processing as the image data input step S101 through the overlapping region removal step S107 and the straight line removal step S1041, respectively.

The character-likeness analysis step S208 in FIG. 4 performs the same processing as the character-likeness analysis unit 28 described above. That is, for each extracted character graphic region candidate, the five kinds of character-likeness reference values described above are calculated, and candidates judged not to be character graphic regions are excluded.

The value of S, the values of Thre1 to Thre5, and the values of w1 and w2 described above are read from the threshold storage unit 19.

As described above, the present embodiment is a method for extracting character graphic string regions from image data, and performs processing in the order of edge extraction, isolated point removal, straight line removal, brush processing, morphology processing, shape analysis, and overlapping region removal.

In addition, the present embodiment allows the shape analysis (S106) of the partial regions extracted as character components to be reduced to the following five simple processes.

The first process deletes regions larger than a specific threshold.

The second process deletes regions smaller than a specific threshold.

The third process deletes regions more elongated than a specific threshold allows.

The fourth process deletes regions in which the value within the circumscribed rectangle (the number of black pixels divided by the number of white pixels) is smaller than a specific threshold.

The fifth process deletes regions whose contrast is smaller than a specific threshold.

Because all of these processes are simple, character graphic strings can be extracted from natural images at very high speed.

For example, if a character code (for example, a JIS (Japanese Industrial Standard) code) is assigned to a threshold set consisting of the thresholds n₃ to n₇ described above, the result can also be handled as a character string (for example, text) processed by a computer.

Although the present invention has been described in detail only with respect to the specific examples given, it is obvious to those skilled in the art that various changes and modifications are possible within the scope of the technical idea of the present invention, and such changes and modifications naturally fall within the scope of the claims.

For example, as a modification of the present embodiment, the threshold storage unit may manage a plurality of threshold sets, each consisting of the thresholds n₁ to n₇. In that case, one threshold set is selected from the plurality of threshold sets in advance, and each threshold is read from that set.

FIG. 1 is a block diagram of the character graphic string extraction device in the present embodiment.
FIG. 2 is a flowchart showing the character graphic string extraction method in the present embodiment.
FIG. 3 is a block diagram of the character graphic string extraction device in the present example.
FIG. 4 is a flowchart showing the character graphic string extraction method in the present example.
FIG. 5 is a diagram showing an example of the brightness histogram in the present example.
FIG. 6 is a diagram showing an example of the stroke width histogram in the present example.

Explanation of symbols

1, 1' … image processing unit
11, 21 … image data input unit
12, 22 … edge extraction unit
13, 23 … isolated point removal unit
14, 24 … brush processing unit
15, 25 … morphology processing unit
16, 26 … shape analysis unit
17, 27 … overlapping region removal unit
18, 29 … character graphic string region output unit
19 … threshold storage unit
28 … character-likeness analysis unit
141, 241 … straight line removal unit
P … peak position
Pl … left peak position
Pr … right peak position
th … threshold

Claims

A character graphic string extraction device for extracting a region forming a character graphic string included in image data,
Image data input means for inputting the image data from image data acquisition means;
An edge extraction unit that performs an edge extraction process on the input image data, and further generates a first stage image data obtained by performing a binarization process on the edge extracted image data;
Isolated point removing means for removing first value pixels isolated from other first value pixels with respect to the first step image data to generate second step image data;
Straight line removing means for detecting a region of first value pixels that are linearly continuous from the second step image data, and further generating third step image data from which the first value pixels constituting the region are removed;
A brush processing means for reading out a threshold value defining a neighborhood related to the target pixel from the threshold value storage unit, performing a brush process on the third-stage image data based on the threshold value, and generating a fourth-stage image data;
Morphology processing means for performing erosion and dilation on the fourth stage image data to generate fifth stage image data;
For the fifth stage image data, the first value pixel area is labeled, the same area is regarded as one area, and the first circumscribed rectangle of each area is calculated,
A first determination for determining true when the number of first value pixels in the first circumscribed rectangle exceeds the upper limit of the number of first value pixels;
A second determination for determining as true when the number of first value pixels in the first circumscribed rectangle falls below a lower limit of the number of first value pixels;
A third determination for determining as true when the short side length of the first circumscribed rectangle falls below the lower limit of the short side length;
A fourth determination for determining as true when the ratio between the number of first value pixels and the number of second value pixels in the first circumscribed rectangle falls below a lower limit value of the ratio;
A fifth determination for determining as true when the contrast of the portion corresponding to the inside of the first circumscribed rectangle in the input image data falls below a lower limit value of contrast;
Shape analysis means for regarding each first circumscribed rectangle determined to be true in any one of the above determinations as a second circumscribed rectangle, removing the second circumscribed rectangles from the calculated first circumscribed rectangles, and generating the remaining first circumscribed rectangles as third circumscribed rectangles;
Overlapping area removing means for removing, among third circumscribed rectangles that are in an inclusion relation, the included third circumscribed rectangle, and generating the remaining third circumscribed rectangles as fourth circumscribed rectangles;
A character / figure string region output means for outputting the fourth circumscribed rectangle as a character / figure string region;
A character graphic string extraction device comprising:

A character graphic string extraction method for extracting a region forming a character graphic string included in image data,
An image data input step of inputting the image data from an image data acquisition means;
An edge extraction step of performing an edge extraction process on the input image data, and further generating a first stage image data obtained by performing a binarization process on the image data extracted from the edge;
An isolated point removing step of removing first value pixels isolated from other first value pixels with respect to the first step image data, and generating second step image data;
A straight line removal step of detecting a region of first value pixels linearly continuous from the second step image data, and further generating third step image data from which the first value pixels constituting the region are removed;
A brush processing step of reading out a threshold value defining a neighborhood related to the target pixel from the threshold value storage unit, performing a brush process on the third-stage image data based on the threshold value, and generating a fourth-stage image data;
Morphology processing step for generating fifth stage image data by performing erosion and dilation on the fourth stage image data;
For the fifth stage image data, the first value pixel area is labeled, the same area is regarded as one area, and the first circumscribed rectangle of each area is calculated,
A first determination for determining true when the number of first value pixels in the first circumscribed rectangle exceeds the upper limit of the number of first value pixels;
A second determination for determining as true when the number of first value pixels in the first circumscribed rectangle falls below a lower limit of the number of first value pixels;
A third determination for determining as true when the short side length of the first circumscribed rectangle falls below the lower limit of the short side length;
A fourth determination for determining as true when the ratio between the number of first value pixels and the number of second value pixels in the first circumscribed rectangle falls below a lower limit value of the ratio;
A fifth determination for determining as true when the contrast of the portion corresponding to the inside of the first circumscribed rectangle in the input image data falls below a lower limit value of contrast;
A shape analysis step of regarding each first circumscribed rectangle determined to be true in any one of the above determinations as a second circumscribed rectangle, removing the second circumscribed rectangles from the calculated first circumscribed rectangles, and generating the remaining first circumscribed rectangles as third circumscribed rectangles;
An overlapping region removing step of removing, among third circumscribed rectangles that are in an inclusion relation, the included third circumscribed rectangle, and generating the remaining third circumscribed rectangles as fourth circumscribed rectangles;
A character graphic string region output step for outputting the fourth circumscribed rectangle as a character graphic string region;
A character graphic string extraction method characterized by comprising:

A character graphic string extraction program characterized in that the character graphic string extraction method according to claim 2 is described as a computer program executable by a computer.

A recording medium on which is recorded a computer-executable program describing the character graphic string extraction method according to claim 2.

A character graphic string extraction device for extracting a region forming a character graphic string included in image data,
Image data input means for inputting the image data from image data acquisition means;
An edge extraction unit that performs an edge extraction process on the input image data, and further generates a first stage image data obtained by performing a binarization process on the edge extracted image data;
Isolated point removing means for removing first value pixels isolated from other first value pixels with respect to the first step image data to generate second step image data;
Straight line removing means for detecting a region of first value pixels that are linearly continuous from the second step image data, and further generating third step image data from which the first value pixels constituting the region are removed;
A brush processing means for reading out a threshold value defining a neighborhood related to the target pixel from the threshold value storage unit, performing a brush process on the third-stage image data based on the threshold value, and generating a fourth-stage image data;
Morphology processing means for performing erosion and dilation on the fourth stage image data to generate fifth stage image data;
For the fifth stage image data, the first value pixel area is labeled, the same area is regarded as one area, and the first circumscribed rectangle of each area is calculated,
A first determination for determining true when the number of first value pixels in the first circumscribed rectangle exceeds the upper limit of the number of first value pixels;
A second determination for determining as true when the number of first value pixels in the first circumscribed rectangle falls below a lower limit of the number of first value pixels;
A third determination for determining as true when the short side length of the first circumscribed rectangle falls below the lower limit of the short side length;
A fourth determination for determining as true when the ratio between the number of first value pixels and the number of second value pixels in the first circumscribed rectangle falls below a lower limit value of the ratio;
A fifth determination for determining as true when the contrast of the portion corresponding to the inside of the first circumscribed rectangle in the input image data falls below a lower limit value of contrast;
The first circumscribed rectangle determined to be true in any one of the above is regarded as the second circumscribed rectangle, the second circumscribed rectangle is removed from the calculated first circumscribed rectangle, and the remaining first circumscribed rectangle is replaced with the third circumscribed rectangle. A shape analysis means that is generated by considering it as a circumscribed rectangle;
Overlapping region removing means for removing, among third circumscribed rectangles that are in an inclusion relation, the included third circumscribed rectangle, and generating fourth circumscribed rectangles by regarding the remaining third circumscribed rectangles as fourth circumscribed rectangles;
Character-likeness analysis means for quantifying, for each fourth circumscribed rectangle, the ease of binarization of the circumscribed rectangular region, the smallness of the variance of the stroke width, the maximum stroke width, the complexity of the boundary between the first value pixel region and the second value pixel region after binarization, and the maximum stroke length, removing from the fourth circumscribed rectangles those determined to be non-character circumscribed rectangular regions, and generating fifth circumscribed rectangles by regarding the remaining fourth circumscribed rectangles as fifth circumscribed rectangles;
Character graphic string region output means for outputting the fifth circumscribed rectangle as a character graphic string region;
A character graphic string extraction device characterized by comprising the above means.
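
As an illustration only, the shape analysis and overlapping region removal recited above can be sketched in Python. The sketch assumes that the first value pixel count and the contrast of each rectangle have already been measured, reads the lower-limit determinations as flagging values that fall below the limit, and uses made-up threshold values. `Rect`, `Limits`, `is_rejected`, `shape_analysis`, and `remove_included` are hypothetical names chosen for this example and do not appear in the claims.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned circumscribed rectangle with inclusive pixel coordinates."""
    left: int
    top: int
    right: int
    bottom: int

    @property
    def width(self) -> int:
        return self.right - self.left + 1

    @property
    def height(self) -> int:
        return self.bottom - self.top + 1

    def contains(self, other: "Rect") -> bool:
        """True if `other` lies entirely inside this rectangle."""
        return (self.left <= other.left and self.top <= other.top
                and self.right >= other.right and self.bottom >= other.bottom)


@dataclass(frozen=True)
class Limits:
    """Hypothetical thresholds; the claims give no numeric values."""
    max_fg: int = 5000            # upper limit on first value pixel count
    min_fg: int = 20              # lower limit on first value pixel count
    min_short_side: int = 4       # lower limit on short side length
    min_fg_bg_ratio: float = 0.1  # lower limit on first/second value ratio
    min_contrast: float = 30.0    # lower limit on contrast


def is_rejected(rect, fg_count, contrast, lim):
    """Return True if any of the five determinations is true."""
    bg_count = rect.width * rect.height - fg_count
    ratio = fg_count / bg_count if bg_count else float("inf")
    return (fg_count > lim.max_fg                                # first
            or fg_count < lim.min_fg                             # second (assumed "below")
            or min(rect.width, rect.height) < lim.min_short_side # third
            or ratio < lim.min_fg_bg_ratio                       # fourth
            or contrast < lim.min_contrast)                      # fifth


def shape_analysis(candidates, lim):
    """candidates: iterable of (Rect, fg_count, contrast).

    Rectangles flagged by any determination play the role of the second
    circumscribed rectangles; the survivors are the third ones.
    """
    return [r for r, fg, c in candidates if not is_rejected(r, fg, c, lim)]


def remove_included(rects):
    """Drop every rectangle that lies inside another, distinct rectangle."""
    unique = list(dict.fromkeys(rects))  # drop exact duplicates, keep order
    return [r for r in unique
            if not any(o != r and o.contains(r) for o in unique)]
```

For example, passing a list of `(Rect, fg_count, contrast)` tuples through `shape_analysis` and then `remove_included` yields the rectangles that would go on to the character-likeness analysis. How contrast is measured and which threshold values are read from the threshold value storage unit are left open here.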

A character graphic string extraction method for extracting a region forming a character graphic string included in image data,
An image data input step of inputting the image data from an image data acquisition means;
An edge extraction step of performing an edge extraction process on the input image data, and generating first stage image data by performing a binarization process on the edge-extracted image data;
An isolated point removing step of removing, from the first stage image data, first value pixels isolated from other first value pixels, and generating second stage image data;
A straight line removal step of detecting a region of linearly continuous first value pixels in the second stage image data, and generating third stage image data from which the first value pixels constituting the region are removed;
A brush processing step of reading out, from the threshold value storage unit, a threshold value defining a neighborhood of the target pixel, performing a brush process on the third stage image data based on the threshold value, and generating fourth stage image data;
Morphology processing step for generating fifth stage image data by performing erosion and dilation on the fourth stage image data;
For the fifth stage image data, the first value pixel regions are labeled, pixels bearing the same label are regarded as one region, and a first circumscribed rectangle is calculated for each region,
A first determination for determining true when the number of first value pixels in the first circumscribed rectangle exceeds the upper limit of the number of first value pixels;
A second determination for determining true when the number of first value pixels in the first circumscribed rectangle falls below the lower limit of the number of first value pixels;
A third determination for determining true when the short side length of the first circumscribed rectangle falls below the lower limit of the short side length;
A fourth determination for determining true when the ratio between the number of first value pixels and the number of second value pixels in the first circumscribed rectangle falls below the lower limit value of the ratio;
A fifth determination for determining true when the contrast of the portion of the input image data corresponding to the inside of the first circumscribed rectangle falls below the lower limit value of contrast;
A shape analysis step that, following these determinations, regards each first circumscribed rectangle determined to be true in any one of them as a second circumscribed rectangle, removes the second circumscribed rectangles from the calculated first circumscribed rectangles, and generates third circumscribed rectangles by regarding the remaining first circumscribed rectangles as third circumscribed rectangles;
An overlapping region removing step of removing, among third circumscribed rectangles that are in an inclusion relation, the included third circumscribed rectangle, and generating fourth circumscribed rectangles by regarding the remaining third circumscribed rectangles as fourth circumscribed rectangles;
A character-likeness analysis step of quantifying, for each fourth circumscribed rectangle, the ease of binarization of the circumscribed rectangular region, the smallness of the variance of the stroke width, the maximum stroke width, the complexity of the boundary between the first value pixel region and the second value pixel region after binarization, and the maximum stroke length, removing from the fourth circumscribed rectangles those determined to be non-character circumscribed rectangular regions, and generating fifth circumscribed rectangles by regarding the remaining fourth circumscribed rectangles as fifth circumscribed rectangles;
A character graphic string region output step for outputting the fifth circumscribed rectangle as a character graphic string region;
A character graphic string extraction method characterized by comprising the above steps.
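
Two of the preprocessing steps recited above, isolated point removal and the erosion and dilation of the morphology processing step, can be sketched on a small binary image as follows. This is a minimal sketch under stated assumptions: first value pixels are represented as 1 and second value pixels as 0, the neighborhood is the 8-neighborhood, a 3x3 structuring element is used, and pixels outside the image are ignored. The function names are hypothetical, and the order shown in the usage note (dilation, then erosion) is only one possible reading, since the claim does not fix the order.

```python
def neighbors8(img, y, x):
    """Yield the values of the 8-neighbors of (y, x) that lie inside the image."""
    h, w = len(img), len(img[0])
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if (dy or dx) and 0 <= y + dy < h and 0 <= x + dx < w:
                yield img[y + dy][x + dx]


def remove_isolated(img):
    """Clear first value pixels (1) that have no first value 8-neighbor."""
    return [[1 if v and any(neighbors8(img, y, x)) else 0
             for x, v in enumerate(row)]
            for y, row in enumerate(img)]


def dilate(img):
    """3x3 dilation: a pixel becomes 1 if it or any 8-neighbor is 1."""
    return [[1 if v or any(neighbors8(img, y, x)) else 0
             for x, v in enumerate(row)]
            for y, row in enumerate(img)]


def erode(img):
    """3x3 erosion: a pixel stays 1 only if it and all in-image neighbors are 1."""
    return [[1 if v and all(neighbors8(img, y, x)) else 0
             for x, v in enumerate(row)]
            for y, row in enumerate(img)]
```

Applying `erode(dilate(img))` to an image with two first value pixels separated by a one-pixel gap fills the gap, which is the kind of merging of nearby strokes into one region that the later labeling relies on. The brush process and straight line removal are not sketched here because this section does not specify them in enough detail.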

A character graphic string extraction program characterized in that the character graphic string extraction method according to claim 6 is described as a computer program executable by a computer.

A recording medium on which is recorded a program describing the character graphic string extraction method according to claim 6 in a computer-executable form.