JPH08249421A - Recognizing method for reverse character - Google Patents
Recognizing method for reverse character
- Publication number
- JPH08249421A
- Authority
- JP
- Japan
- Prior art keywords
- character
- value
- black
- reverse
- area
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Character Input (AREA)
Description
[0001]
[Field of Industrial Application] The present invention relates to a character recognition method for an optical character reader, and in particular to a method for recognizing black-and-white reverse-printed characters.
[0002]
[Prior Art] As illustrated in FIG. 3, a character reading device (OCR) 1 consists of a character reading and recognition unit 2, which comprises an image scanner 21 and a character recognition processor 22, and a host computer 3. The character recognition processor 22 takes as input the image data that the image scanner 21 obtains by optically scanning the document to be read, and processes it according to the flow outlined in FIG. 4. It first examines the image data and cuts out each character pattern data area that is to be processed as one character, then analyzes the cut-out character pattern data to extract feature parameters. These feature parameters are collated with a dictionary of feature parameters prepared in advance for each character in the reading target range, and the character whose feature parameters match is extracted. The character that was read is thereby recognized, and the character classification code assigned to it is output as character information. This is the basic function of the processor. The host computer 3 serves as a man-machine interface for setting the reading conditions of the document to be read and for displaying the reading results, and it also carries out tasks such as editing and proofreading documents based on the document information obtained by reading, or building a database.
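The dictionary collation step described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the processing of any particular OCR device: the source does not say which feature parameters are used, so the quadrant black-pixel counts, the squared-distance matching, and the names `extract_features`, `recognize`, `L_GLYPH`, `T_GLYPH`, and `DICTIONARY` are all hypothetical.

```python
from typing import Dict, List, Tuple

Bitmap = List[List[int]]  # 1 = black pixel, 0 = white pixel
Features = Tuple[int, int, int, int]


def extract_features(glyph: Bitmap) -> Features:
    """Hypothetical feature parameters: black-pixel count per quadrant."""
    h, w = len(glyph), len(glyph[0])
    mh, mw = h // 2, w // 2

    def count(r0: int, r1: int, c0: int, c1: int) -> int:
        return sum(glyph[r][c] for r in range(r0, r1) for c in range(c0, c1))

    return (count(0, mh, 0, mw), count(0, mh, mw, w),
            count(mh, h, 0, mw), count(mh, h, mw, w))


def recognize(glyph: Bitmap, dictionary: Dict[str, Features]) -> str:
    """Return the character code whose dictionary features are closest."""
    feats = extract_features(glyph)

    def distance(code: str) -> int:
        return sum((a - b) ** 2 for a, b in zip(feats, dictionary[code]))

    return min(dictionary, key=distance)


# Toy dictionary built from two 4x4 reference glyphs (assumed data).
L_GLYPH = [[1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0], [1, 1, 1, 1]]
T_GLYPH = [[1, 1, 1, 1], [0, 1, 1, 0], [0, 1, 1, 0], [0, 1, 1, 0]]
DICTIONARY = {"L": extract_features(L_GLYPH), "T": extract_features(T_GLYPH)}

# A slightly damaged "L" (one pixel missing) still matches "L".
noisy_l = [[1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0], [1, 1, 1, 0]]
print(recognize(noisy_l, DICTIONARY))
```

A matcher of this kind assumes black strokes on a white background, which is why a reverse-printed glyph fed in unchanged would produce features far from every dictionary entry.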
[0003]
[Problem to be Solved by the Invention] In character recognition methods according to the prior art, when a document to be read contains an area printed in reversed characters, as illustrated in FIG. 2, for the purpose of emphasizing a heading or the like, that area is treated as a non-character area of the same kind as figures and photographs and is excluded from character recognition. However, headings and the like are frequently printed in reversed characters for emphasis, and there is a need to have them recognized as well.
[0004] An object of the present invention is to meet this need by providing a character recognition method that can also recognize reversed characters.
[0005]
[Means for Solving the Problem] In the character recognition method according to the present invention, the following processing is provided as preprocessing for character recognition. An area to be inspected for the presence of reversed characters is designated in the binarized document image data obtained by the image scanner optically scanning the document to be read. The black pixels and white pixels in the designated area are each counted to obtain the value of a black pixel density feature. If this value exceeds an inversion discrimination reference value set in advance, the pixel data of the designated area is inverted; if it does not exceed the reference value, the pixel data of the designated area is not inverted.
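The preprocessing just described can be sketched in a few lines of Python. This is a minimal sketch under assumptions: the bitmap is represented as nested lists with 1 for black and 0 for white, the function name `normalize_region` is hypothetical, and the reference value 0.5 used in the example is an illustrative choice that the source does not specify.

```python
from typing import List

Bitmap = List[List[int]]  # 1 = black pixel, 0 = white pixel


def normalize_region(region: Bitmap, reference_value: float) -> Bitmap:
    """Invert the region only if its black pixel density exceeds the reference."""
    black = sum(sum(row) for row in region)
    white = sum(len(row) for row in region) - black
    if black + white == 0:
        raise ValueError("designated region contains no pixels")
    density = black / (black + white)
    if density > reference_value:
        # Judged to be reverse printed: flip every pixel.
        return [[1 - px for px in row] for row in region]
    # Judged to be normally printed: return an unchanged copy.
    return [row[:] for row in region]


reversed_dot = [[1, 1, 1], [1, 0, 1], [1, 1, 1]]  # white dot on black
normal_dot = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]    # black dot on white

# Both inputs come out as a black dot on a white background.
print(normalize_region(reversed_dot, 0.5))
print(normalize_region(normal_dot, 0.5))
```

Returning a new bitmap rather than modifying the input is a design choice of this sketch; the source only requires that the inverted data be what reaches the recognition stage.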
[0006]
[Operation] When an area to be inspected for reversed characters is designated in the binarized document image data input by the image scanner, the value of the black pixel density feature is obtained for that area. If the value exceeds the inversion discrimination reference value set in advance, the binarized pixel data of the designated area is inverted, which yields character image data equivalent to non-reversed, normally printed characters. If the value does not exceed the reference value, the pixel data of the designated area is not inverted, so the character image data of normally printed characters remains as it is. The subsequent character recognition stage can therefore process all characters as non-reversed, normally printed characters.
[0007]
[Embodiment] FIG. 1 shows the processing flow in one embodiment of the method according to the present invention for recognizing reverse-printed characters such as those illustrated in FIG. 2, and the method is described below with reference to FIG. 1. The configuration of the character reading device that carries out the method is the same as that of the device illustrated in FIG. 3, which was used in describing the prior art, and the reference numerals given in FIG. 3 are cited below where needed.
[0008] Before reading, the character recognition processor 22 of the character reading device first displays the binarized document image data scanned and input by the image scanner 21, as is, on the display device 31 of the host computer 3 (S2). The reading direction, reading range, and other reading conditions for the document image data are then set through the host computer 3 with reference to the contents of this display (S31). In the method of the present invention, these reading condition settings include a step (S3) of selecting and designating the area of the input document image data that is to be inspected for reverse-printed characters.
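One way to hold the designated inspection areas alongside the other reading conditions is sketched below. The source does not describe any data layout, so the rectangular `Region` type, the `ReadingConditions` class, its fields, and the `crop` helper are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import List

Bitmap = List[List[int]]  # 1 = black pixel, 0 = white pixel


@dataclass(frozen=True)
class Region:
    """Hypothetical rectangular inspection area, in pixel coordinates."""
    top: int
    left: int
    height: int
    width: int

    def fits(self, image: Bitmap) -> bool:
        rows = len(image)
        cols = len(image[0]) if rows else 0
        return (self.top >= 0 and self.left >= 0
                and self.height > 0 and self.width > 0
                and self.top + self.height <= rows
                and self.left + self.width <= cols)


@dataclass
class ReadingConditions:
    """Assumed container for the settings made through the host computer."""
    direction: str = "horizontal"
    inversion_check_regions: List[Region] = field(default_factory=list)

    def designate(self, image: Bitmap, region: Region) -> None:
        # Reject areas that fall outside the displayed image.
        if not region.fits(image):
            raise ValueError("region lies outside the displayed image")
        self.inversion_check_regions.append(region)


def crop(image: Bitmap, region: Region) -> Bitmap:
    """Copy the pixels of a designated area out of the page image."""
    return [row[region.left:region.left + region.width]
            for row in image[region.top:region.top + region.height]]


page = [[1, 1, 1, 0, 0, 0],
        [1, 0, 1, 0, 1, 0],
        [1, 1, 1, 0, 0, 0]]
conditions = ReadingConditions()
conditions.designate(page, Region(top=0, left=0, height=3, width=3))
print(crop(page, conditions.inversion_check_regions[0]))
```

Validating the rectangle at designation time keeps the later pixel-counting step free of bounds checks.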
[0009] When the reverse-print inspection area has been designated, the black pixels and white pixels in the image data of that area are each counted, and the value of the black pixel density feature Fb, which indicates the proportion occupied by black pixels, is computed (S4). As the example in FIG. 2 shows, black pixels normally occupy a larger proportion of a reverse-printed character area than of a non-reversed printed area. Therefore, once Fb has been obtained for the designated area, it is compared (S5) with the inversion discrimination reference value th, which is set by choosing a value that exceeds the maximum black pixel density feature Fbm obtained empirically for non-reversed printed character areas. If Fb exceeds th, the designated area can be judged to be a black-and-white reverse-printed area, and the binarized data of the area is inverted (S6).
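The relationship between Fb, Fbm, and th can be sketched as follows. The source says only that th is chosen to exceed Fbm, so adding a fixed `margin` to Fbm is an assumption of this sketch, as are the function names `black_pixel_density`, `choose_threshold`, and `is_reverse_printed` and the tiny sample bitmaps.

```python
from typing import Iterable, List

Bitmap = List[List[int]]  # 1 = black pixel, 0 = white pixel


def black_pixel_density(region: Bitmap) -> float:
    """Fb: proportion of black pixels among all pixels in the region (S4)."""
    black = sum(sum(row) for row in region)
    white = sum(len(row) for row in region) - black
    return black / (black + white)


def choose_threshold(normal_samples: Iterable[Bitmap], margin: float = 0.05) -> float:
    """th: a value above Fbm, the largest Fb seen on non-reversed samples.

    The fixed margin is an assumed policy; any value greater than Fbm
    satisfies the description in the text.
    """
    fbm = max(black_pixel_density(sample) for sample in normal_samples)
    return fbm + margin


def is_reverse_printed(region: Bitmap, th: float) -> bool:
    """S5: the region is judged reverse printed when Fb exceeds th."""
    return black_pixel_density(region) > th


# Assumed non-reversed training samples with Fb = 0.125 and Fb = 0.25.
normal_samples = [[[0, 1, 0, 0], [0, 0, 0, 0]],
                  [[1, 0], [0, 0]]]
th = choose_threshold(normal_samples)

print(is_reverse_printed([[1, 1, 1], [1, 0, 1]], th))  # mostly black region
print(is_reverse_printed([[0, 1], [0, 0]], th))        # mostly white region
```

Because the comparison is strict, a normally printed area whose density equals Fbm can never be inverted as long as the margin is positive.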
[0010] If, on the other hand, the value Fb of the black pixel density feature for the designated area does not exceed the inversion discrimination reference value th, the area is judged not to be reverse printed, and the binarized pixel data is not inverted. Once steps S5 and S6 are complete, the character data of reversed characters has been converted into the equivalent of non-reversed characters, so that character pixel data in non-reversed form is available for every character area of the input document image. Character recognition is then carried out by the prior-art method to recognize the characters (S7).
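Steps S4 to S7 applied to a whole page can be sketched as below. This is an illustration under assumptions: the `(top, left, height, width)` box layout, the names `preprocess_page` and `read_page`, and the stand-in recognizer that merely counts black pixels are hypothetical, and the reference value 0.5 is an arbitrary example.

```python
from typing import Callable, List, Tuple

Bitmap = List[List[int]]         # 1 = black pixel, 0 = white pixel
Box = Tuple[int, int, int, int]  # (top, left, height, width), assumed layout


def preprocess_page(page: Bitmap, boxes: List[Box], th: float) -> Bitmap:
    """Return a copy of the page with reverse-printed boxes inverted."""
    out = [row[:] for row in page]
    for top, left, height, width in boxes:
        cells = [(r, c)
                 for r in range(top, top + height)
                 for c in range(left, left + width)]
        if not cells:
            continue  # nothing to inspect in an empty box
        fb = sum(out[r][c] for r, c in cells) / len(cells)  # S4
        if fb > th:                                         # S5
            for r, c in cells:
                out[r][c] = 1 - out[r][c]                   # S6
    return out


def read_page(page: Bitmap, boxes: List[Box], th: float,
              recognizer: Callable[[Bitmap], object]) -> object:
    """S7: hand the normalized page to an unchanged, conventional recognizer."""
    return recognizer(preprocess_page(page, boxes, th))


# Left half: white dot on black (reversed). Right half: black dot on white.
page = [[1, 1, 1, 0, 0, 0],
        [1, 0, 1, 0, 1, 0],
        [1, 1, 1, 0, 0, 0]]
boxes = [(0, 0, 3, 3), (0, 3, 3, 3)]

normalized = preprocess_page(page, boxes, th=0.5)
print(normalized)

# Stand-in recognizer: counts black pixels instead of recognizing characters.
print(read_page(page, boxes, 0.5, lambda bitmap: sum(map(sum, bitmap))))
```

The recognizer is passed in as a callable to reflect the point made in the text: the downstream stage needs no knowledge of which areas were originally reverse printed.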
[0011]
[Effect of the Invention] According to the preprocessing method for character recognition of the present invention, even when the document to be read contains black-and-white reverse-printed characters, designating the reverse-printed area as an inversion inspection area causes the value of the black pixel density feature to be obtained for that area. If this value exceeds the inversion discrimination reference value, the binarized pixel data of the designated area is inverted to obtain character data equivalent to non-reversed, normally printed characters. As a result, even for a document in which reverse-printed characters and non-reversed, normally printed characters are mixed, the reverse-printed areas no longer need to be separated out and processed separately, and the efficiency of document reading and recognition can be improved.
[FIG. 1] Flow chart of the character recognition preprocessing method according to the present invention.
[FIG. 2] Diagram showing an example of reverse-printed (light/dark inverted) characters.
[FIG. 3] Block diagram of a character reading device.
[FIG. 4] Basic flow chart of character recognition processing in a character reading device.
1 character reading device, 2 character recognition device, 21 image scanner, 22 character recognition processor, 3 host computer
Claims (1)
1. A method for recognizing reversed characters, characterized in that, as preprocessing for character recognition, a process is provided which: designates an area to be inspected for the presence of reversed characters in binarized document image data obtained by an image scanner optically scanning a document to be read; counts the black pixels and white pixels in the designated area to obtain the value of a black pixel density feature; inverts the pixel data of the designated area when the value of the black pixel density feature exceeds an inversion discrimination reference value set in advance; and does not invert the pixel data of the designated area when the value of the black pixel density feature does not exceed the inversion discrimination reference value.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP7052237A JPH08249421A (en) | 1995-03-13 | 1995-03-13 | Recognizing method for reverse character |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP7052237A JPH08249421A (en) | 1995-03-13 | 1995-03-13 | Recognizing method for reverse character |
Publications (1)
Publication Number | Publication Date |
---|---|
JPH08249421A true JPH08249421A (en) | 1996-09-27 |
Family
ID=12909122
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
JP7052237A Pending JPH08249421A (en) | 1995-03-13 | 1995-03-13 | Recognizing method for reverse character |
Country Status (1)
Country | Link |
---|---|
JP (1) | JPH08249421A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2004097721A1 (en) * | 2003-04-25 | 2004-11-11 | Sharp Kabushiki Kaisha | Image processing device, image processing method, image processing program, and computer-readable recording medium containing the program |
CN102096906A (en) * | 2010-12-13 | 2011-06-15 | 汉王科技股份有限公司 | Panoramic binary image-based reversal processing method and device |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US6788810B2 (en) | Optical character recognition device and method and recording medium | |
US6798906B1 (en) | Image processing apparatus and method including line segment data extraction | |
JP4261005B2 (en) | Region-based image binarization system | |
JP4574503B2 (en) | Image processing apparatus, image processing method, and program | |
JP2010186246A (en) | Image processing apparatus, method, and program | |
US6983071B2 (en) | Character segmentation device, character segmentation method used thereby, and program therefor | |
JPH08249421A (en) | Recognizing method for reverse character | |
JP2845370B2 (en) | Character recognition method | |
JP2003087562A (en) | Image processor and image processing method | |
JPH0333990A (en) | Optical character recognition instrument and method using mask processing | |
JP2001109887A (en) | Area extracting method, method and device for extracting address area, and image processor | |
Aparna et al. | A complete OCR system development of Tamil magazine documents | |
JPH06131495A (en) | Image information extraction system | |
JPH08272902A (en) | Method for recognizing character of different quality and different font | |
JP7532124B2 (en) | Information processing device, information processing method, and program | |
JP2003196592A (en) | Program for processing image, and image processor | |
JPH07230525A (en) | Method for recognizing ruled line and method for processing table | |
JPS5949671A (en) | Optical character reader | |
JP3162414B2 (en) | Ruled line recognition method and table processing method | |
JP2000331118A (en) | Image processor and recording medium | |
Cracknell et al. | A colour classification approach to form dropout | |
KR940003623B1 (en) | Method of taking out the part of korean characters | |
JP2001143076A (en) | Image processor | |
JPH05189604A (en) | Optical character reader | |
JP2902904B2 (en) | Character recognition device |