JPH045779A

JPH045779A - Character recognizing device

Info

Publication number: JPH045779A
Application number: JP2108287A
Authority: JP
Inventors: Koji Ito; 伊東　晃治; Yoshiyuki Yamashita; 山下　義征
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 1990-04-24
Filing date: 1990-04-24
Publication date: 1992-01-09

Abstract

PURPOSE: To provide a character recognition device with high accuracy and high processing speed by extracting a feature of the boundary line of a recognition processing area designated by an operator and automatically changing the recognition processing area when the feature does not satisfy a predetermined condition. CONSTITUTION: A character recognition device 20 comprises an image storage unit 30 that stores image data S of characters and figures on a medium, a display unit 40 that displays the image data S, an input unit 50 that designates a recognition processing area within the image data S, a recognition unit 60 that cuts out image data character by character from the designated recognition processing area and recognizes the characters, a boundary feature extraction unit 70 that extracts a feature of the boundary line of the designated recognition processing area, an area changing unit 80 that changes the designated recognition processing area based on the boundary line feature extracted by the extraction unit 70, and a control unit 90 that controls the respective components. Recognition processing is thus executed on an appropriate recognition processing area.

Description

[Detailed Description of the Invention] (Field of Industrial Application) The present invention relates to character recognition devices, and in particular to character recognition devices of the type in which an operator designates the area on which the device is to perform recognition processing.

(Prior Art) In some character recognition devices, the operator designates the area on which the device is to perform recognition processing (hereinafter sometimes abbreviated as the recognition processing area), and recognition processing is then executed on that area. For this purpose, a device of this type has comprised an image storage unit that stores image data of characters and figures on a medium, a display unit that displays the image data, an input unit that designates a recognition processing area within the image data, and a recognition unit that cuts out image data character by character from the designated recognition processing area and recognizes the characters.

Specifically, the input unit consisted of a pointing device such as a light pen or a mouse, and the display unit consisted of a CRT. The recognition processing area was designated by the operator using the pointing device to indicate the desired position in the image data displayed on the CRT.

(Problem to Be Solved by the Invention) However, when the operator tries to point with a pointing device at the blank space between text lines or between characters in order to designate a recognition processing area, accurate pointing is difficult because ordinary printed documents have narrow line spacing and character spacing.

Consequently, the recognition processing area often cuts off part of a character or, conversely, includes an unwanted area, which lowers both the recognition accuracy and the processing speed of the character recognition device.

FIGS. 4(A) and 4(B) show specific examples.

FIG. 4(A) shows an example in which the lower side of the boundary line 11 defining the recognition processing area cuts across the characters to be recognized, so that part of each character is missing. When recognition processing is performed on such an area, the character recognition device outputs a misread or a rejection as its recognition result, which lowers the recognition accuracy of the device.

FIG. 4(B) shows an example in which the boundary line 11 defining the recognition processing area extends into an area 13 that is not a recognition target. When recognition processing is performed on such an area, the character recognition device outputs recognition results that include unwanted results, so the results must be corrected afterwards, which lowers the processing speed of the device.

The present invention was made in view of these points. Accordingly, an object of the invention is to provide a character recognition device that can determine whether the recognition processing area designated by the operator is appropriate and, if it is not, change it to an appropriate one.

(Means for Solving the Problem) To achieve this object, according to the present invention, a character recognition device comprising an image storage unit that stores image data of characters and figures on a medium, a display unit that displays the image data, an input unit that designates a recognition processing area within the image data, and a recognition unit that cuts out image data character by character from the designated recognition processing area and recognizes the characters is characterized by further comprising: a boundary feature extraction unit that extracts a feature of the boundary line of the recognition processing area designated by the input unit; and an area changing unit that changes the recognition processing area designated by the input unit based on the boundary line feature extracted by the boundary feature extraction unit.

In carrying out the invention, the display unit is preferably configured to display the image data, the recognition processing area designated by the input unit, and the changed recognition processing area produced by the area changing unit superimposed on one another in a form that allows each to be visually distinguished, for example in different colors or brightnesses.

(Function) With this configuration, even if the operator mistakenly designates a recognition processing area that cuts off part of the characters to be recognized, the character recognition device changes the area so that it includes those characters. Conversely, even if the operator designates an area that includes an unwanted portion (for example, part of the characters of an adjacent recognition processing area), the device changes the area so that the unwanted portion is excluded.

(Embodiments) An embodiment of the character recognition device of the present invention is described below with reference to the drawings.

FIG. 1 is a block diagram schematically showing the configuration of the character recognition device 20 of the embodiment.

The character recognition device 20 of this embodiment comprises an image storage unit 30 that stores image data S of characters and figures on a medium, a display unit 40 that displays the image data S, an input unit 50 that designates a recognition processing area within the image data S, and a recognition unit 60 that cuts out image data character by character from the designated recognition processing area and recognizes the characters. It further comprises a boundary feature extraction unit 70 that extracts a feature of the boundary line of the recognition processing area designated by the input unit 50, an area changing unit 80 that changes the recognition processing area designated by the input unit 50 based on the boundary line feature extracted by the boundary feature extraction unit 70, and a control unit 90 that controls the components 30 to 80.

In this embodiment, the image data S is input to the image storage unit 30 from a photoelectric conversion unit (not shown) and represents the character strokes on the medium as black pixels and the background as white pixels.

The image storage unit 30 of this embodiment is a memory that can store the image data in a form from which the two-dimensional coordinates of the characters and figures on the medium can be reproduced.

The display unit 40 of this embodiment includes a display device such as a CRT.

The input unit 50 of this embodiment includes a pointing device such as a light pen or a mouse, a keyboard, and the like.

Through the input unit 50, the operator can give various instructions to the character recognition device 20.

The recognition unit 60 of this embodiment comprises a line position detection unit 62 that cuts out text line data from the image data stored in the image storage unit 30, a character segmentation unit 64 that cuts out character pattern data one character at a time from the text line data, and an identification unit 66 that identifies the cut-out character pattern data, for example by matching it against a dictionary.

The boundary feature extraction unit 70 of this embodiment extracts the feature of the boundary line of the recognition processing area designated by the input unit 50 based on the number of black pixels on that boundary line.

When the feature of the boundary line does not satisfy a predetermined condition, the area changing unit 80 of this embodiment sets a plurality of boundary candidate lines in the vicinity of the boundary line, extracts the features of these candidate lines, and takes as the new boundary line the candidate line that is closest to the original boundary line among those whose features satisfy the predetermined condition.

The control unit 90 controls the components 30 to 80.

The components 30 to 90 are described in detail below, together with the operation of the character recognition device of the embodiment.

When the image storage unit 30 has stored the image data S, the control unit 90 reads the image data from the image storage unit 30 and transfers it to the display unit 40 together with color data specifying the color in which it is to be displayed.

The display unit 40 displays the image data transferred from the control unit 90 on its display device in the color specified by the control unit 90.

The display color can be specified arbitrarily through the input unit 50.

Next, the operator designates the recognition processing area within the image data shown on the display device of the display unit 40, using the pointing device or other means of the input unit 50.

In response, the input unit 50 outputs the coordinates of the designated recognition processing area to the control unit 90.

In this case, the operator designates the recognition processing area through the input unit 50 by indicating two points on the display device of the display unit 40 with the pointing device. FIG. 2 shows a specific example of the result: it schematically illustrates the relationship between the image data 100 in the image storage unit 30 and the recognition processing area 102 designated through the input unit 50. The rectangular area defined by the first point A and the second point B is the recognition processing area. In this example, however, because of an operating error by the operator, the lower side of the boundary line defining the recognition processing area 102 cuts across the characters to be recognized, so that part of each character is missing.
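
How two pointed positions define the rectangle and its four sides can be sketched as follows. This is only an illustration, not the patent's implementation: the function names `rectangle_from_points` and `sides` and the coordinate convention (x to the right, y downward, inclusive pixel coordinates) are assumptions introduced here.

```python
from typing import Dict, Tuple

Point = Tuple[int, int]
Rect = Tuple[int, int, int, int]  # (x0, y0, x1, y1), inclusive corners


def rectangle_from_points(a: Point, b: Point) -> Rect:
    """Normalize two pointed positions into (left, top, right, bottom)."""
    (ax, ay), (bx, by) = a, b
    return (min(ax, bx), min(ay, by), max(ax, bx), max(ay, by))


def sides(rect: Rect) -> Dict[str, Tuple[Point, Point]]:
    """Return the end points of the four sides of the rectangle."""
    x0, y0, x1, y1 = rect
    return {
        "top": ((x0, y0), (x1, y0)),
        "left": ((x0, y0), (x0, y1)),
        "bottom": ((x0, y1), (x1, y1)),
        "right": ((x1, y0), (x1, y1)),
    }
```

Normalizing with `min` and `max` means the operator may indicate the two corners in either order.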

Next, the control unit 90 outputs to the display unit 40 the coordinates of the recognition processing area received from the input unit 50, together with color data specifying the color in which the boundary line of the area is to be displayed.

The display unit 40 displays the boundary line of the recognition processing area on its display device, superimposed on the image data already displayed, according to the coordinates and color transferred from the control unit 90.

The control unit 90 also outputs the coordinates of each side to the boundary feature extraction unit 70 so that this unit extracts the feature of the boundary line defining the recognition processing area, that is, in this embodiment, the feature of each of the four sides defining the area.

The boundary feature extraction unit 70 counts the black pixels on each of the four sides received from the control unit 90, determines whether each count exceeds a predetermined number, and outputs the determination result together with the coordinates of the four sides to the area changing unit 80. Although not limited to this value, the predetermined number here is 1% of the number of pixels making up the side in question. For the lower side in FIG. 2, for example, it is 1% of the number of pixels given by the difference between the X coordinates of points A and B.

With this threshold, when a boundary line lies in the background between text lines, only white pixels are detected on it, so the count is found not to exceed the predetermined number. When a boundary line crosses a text line, black pixels are detected on it, so the count may exceed the predetermined number. In this way the unit determines whether the designation of the recognition processing area is appropriate.
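
The per-side test described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the image is assumed to be a list of rows with 1 for black and 0 for white, the rectangle uses inclusive coordinates, and the names `side_pixels` and `check_sides` are hypothetical. The side length is taken here as the number of pixels on the side, which differs by one from a plain coordinate difference.

```python
from typing import Dict, List

Image = List[List[int]]  # assumed encoding: 1 = black pixel, 0 = white pixel


def side_pixels(img: Image, x0: int, y0: int, x1: int, y1: int) -> Dict[str, List[int]]:
    """Collect the pixels lying on each side of the inclusive rectangle."""
    return {
        "top": img[y0][x0:x1 + 1],
        "left": [img[y][x0] for y in range(y0, y1 + 1)],
        "bottom": img[y1][x0:x1 + 1],
        "right": [img[y][x1] for y in range(y0, y1 + 1)],
    }


def check_sides(img: Image, x0: int, y0: int, x1: int, y1: int,
                ratio: float = 0.01) -> Dict[str, bool]:
    """Return True for each side whose black-pixel count exceeds ratio * side length."""
    return {
        name: sum(pixels) > ratio * len(pixels)
        for name, pixels in side_pixels(img, x0, y0, x1, y1).items()
    }
```

A side marked `True` is one that crosses character strokes and is therefore a candidate for adjustment.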

In the example of FIG. 2, the lower of the four sides making up the boundary line crosses a text line, so its black pixel count exceeds the predetermined number. In this embodiment, the boundary feature extraction unit 70 reports this to the area changing unit 80 in the form of an area change number. The area change number tells the area changing unit 80 on which of the four sides the black pixel count exceeds the predetermined number: it is "1" when the count on the upper side exceeds the predetermined number, "2" for the left side, "4" for the lower side, and "8" for the right side. Accordingly, the area change number is "15" when the count exceeds the predetermined number on all four sides and "0" when it does not exceed it on any side.
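
The side codes 1, 2, 4, and 8 behave like bit flags, which is consistent with the stated totals of 15 and 0. The following sketch shows one way to encode and decode such a number. Combining the codes with a bitwise OR is an interpretation, and the function names are hypothetical.

```python
from typing import List

# Side codes as stated in the text.
TOP, LEFT, BOTTOM, RIGHT = 1, 2, 4, 8


def area_change_number(top: bool, left: bool, bottom: bool, right: bool) -> int:
    """Combine the per-side results into a single area change number."""
    number = 0
    for exceeded, code in ((top, TOP), (left, LEFT), (bottom, BOTTOM), (right, RIGHT)):
        if exceeded:
            number |= code
    return number


def sides_to_change(number: int) -> List[str]:
    """List the sides flagged in an area change number."""
    names = (("top", TOP), ("left", LEFT), ("bottom", BOTTOM), ("right", RIGHT))
    return [name for name, code in names if number & code]
```

For the FIG. 2 example, where only the lower side is flagged, `area_change_number(False, False, True, False)` evaluates to 4.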

When the area change number received from the boundary feature extraction unit 70 is "0", the area changing unit 80 outputs the coordinates of the four sides to the control unit 90 unchanged. When the area change number is other than "0", the area changing unit 80 sets a plurality of boundary candidate lines parallel to each side indicated by the area change number, extracts the black pixel count on each candidate line, and outputs to the control unit 90 the coordinates of the candidate line that is closest to the originally set boundary line among those whose black pixel count does not exceed the predetermined number. In the example of FIG. 2, the new boundary line 104 (shown by a broken line in FIG. 2) is determined from among the candidate lines obtained by varying the Y coordinate of the lower side by ±1, ±2, ..., ±n relative to the originally set Y coordinate.
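
The candidate search for a horizontal side can be sketched as follows. This is an illustrative reading of the text, with several assumptions: the image is a list of rows with 1 for black, the function name `nearest_clear_row` is hypothetical, candidates falling outside the image are skipped, and at equal distance the +k candidate is tried before the -k candidate (the text does not say how such ties are resolved).

```python
from typing import List, Optional

Image = List[List[int]]  # assumed encoding: 1 = black pixel, 0 = white pixel


def nearest_clear_row(img: Image, y: int, x0: int, x1: int, n: int,
                      ratio: float = 0.01) -> Optional[int]:
    """Search y±1, y±2, ..., y±n for the closest row that passes the threshold.

    Returns the row index of the new boundary, or None if no candidate
    within n rows satisfies the condition.
    """
    limit = ratio * (x1 - x0 + 1)
    for k in range(1, n + 1):
        for candidate in (y + k, y - k):  # tie-break order is an assumption
            if 0 <= candidate < len(img) and sum(img[candidate][x0:x1 + 1]) <= limit:
                return candidate
    return None
```

Searching outward in increasing distance guarantees that the first passing candidate is also the one closest to the original boundary. A vertical side would be handled the same way with rows and columns exchanged.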

In this example, therefore, the recognition processing area is changed so that it is enlarged in the direction of the lower side. In this embodiment the features of the boundary candidate lines are extracted in the area changing unit 80, but they may of course be extracted in the boundary feature extraction unit 70 instead.

Next, the control unit 90 outputs to the display unit 40 the new coordinates of the recognition processing area received from the area changing unit 80, together with color data specifying the color in which the new area is to be displayed.

The display unit 40 displays the new recognition processing area on its display device in the specified color, superimposed on the already displayed recognition processing area designated by the operator.

FIG. 3 shows a display example corresponding to the conditions of FIG. 2. In FIG. 3, 110 is the display screen of the display device, 112 is the image data, 114 is the boundary line of the recognition processing area designated by the operator, and 116 is the boundary line of the new recognition processing area as changed by the character recognition device. Although the invention is not limited to this, in this embodiment the image data 112 is displayed in black, the boundary line 114 of the area designated by the operator in blue, and the boundary line 116 of the changed area in red.

The control unit 90 also outputs the new coordinates of the recognition processing area received from the area changing unit 80 to the line position detection unit 62 in the recognition unit 60.

The line position detection unit 62 scans the area of the image storage unit 30 given by the coordinates received from the control unit 90 (the recognition processing area) in a predetermined direction (horizontally in this embodiment, because the recognition targets are horizontally written documents), counts the cumulative number of black pixels row by row to extract a marginal distribution (projection profile), detects the line positions from this distribution, and outputs the line positions to the character segmentation unit 64.
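
Line detection from a row-wise projection profile can be sketched as follows. The text does not state the rule used to turn the profile into line positions, so this sketch assumes the simplest one: a text line is a maximal run of rows containing at least one black pixel. The function names and the image encoding (1 for black) are also assumptions.

```python
from typing import List, Tuple

Image = List[List[int]]  # assumed encoding: 1 = black pixel, 0 = white pixel


def row_profile(img: Image, x0: int, y0: int, x1: int, y1: int) -> List[int]:
    """Black-pixel count of every row inside the inclusive rectangle."""
    return [sum(img[y][x0:x1 + 1]) for y in range(y0, y1 + 1)]


def detect_lines(img: Image, x0: int, y0: int, x1: int, y1: int) -> List[Tuple[int, int]]:
    """Return (top, bottom) row indices of each run of rows containing black pixels."""
    lines = []
    start = None
    for offset, count in enumerate(row_profile(img, x0, y0, x1, y1)):
        y = y0 + offset
        if count > 0 and start is None:
            start = y  # a text line begins
        elif count == 0 and start is not None:
            lines.append((start, y - 1))  # the text line ended on the previous row
            start = None
    if start is not None:
        lines.append((start, y1))  # a line still open at the bottom of the area
    return lines
```

A practical detector would likely tolerate noise by using a small threshold instead of zero.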

Next, based on the line positions received from the line position detection unit 62, the character segmentation unit 64 scans each line area in a predetermined direction (vertically in this embodiment), counts the cumulative number of black pixels column by column to extract a marginal distribution, cuts out character pattern data one character at a time based on this distribution, and outputs the character pattern data to the identification unit 66.
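
Character segmentation from a column-wise profile can be sketched in the same spirit. The cutting rule is again an assumption (a character is a maximal run of columns containing black pixels, so touching characters would not be separated), and the names `column_profile` and `cut_characters` are hypothetical.

```python
from itertools import groupby
from typing import List, Tuple

Image = List[List[int]]  # assumed encoding: 1 = black pixel, 0 = white pixel
Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1), inclusive


def column_profile(img: Image, x0: int, y0: int, x1: int, y1: int) -> List[int]:
    """Black-pixel count of every column inside the inclusive rectangle."""
    rows = [img[y][x0:x1 + 1] for y in range(y0, y1 + 1)]
    return [sum(column) for column in zip(*rows)]


def cut_characters(img: Image, x0: int, y0: int, x1: int, y1: int) -> List[Box]:
    """Split one text line into boxes at columns that contain no black pixels."""
    boxes = []
    x = x0
    profile = column_profile(img, x0, y0, x1, y1)
    for has_ink, group in groupby(profile, key=lambda count: count > 0):
        width = len(list(group))
        if has_ink:
            boxes.append((x, y0, x + width - 1, y1))
        x += width
    return boxes
```

Each returned box would then be passed on for identification.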

Next, the identification unit 66 extracts features of the character pattern data received from the character segmentation unit 64 by a suitable conventional method, matches these features against a dictionary prepared in advance, and outputs the identification result to the control unit 90.

Next, the control unit 90 outputs the identification result received from the identification unit 66 to the display unit 40, which shows the result on its display device. The operator checks the displayed recognition result and, if there is a misread or a rejection, corrects it through the input unit 50.

When the operator's check shows that identification was correct, or once corrections are complete, the control unit 90 outputs the identification result from terminal 92 to an external device as character name output.

An embodiment of the character recognition device of the present invention has been described above, but the invention is not limited to that embodiment and can be modified in various ways, as described below.

For example, in the embodiment above the boundary feature extraction unit extracts the number of black pixels on the boundary line as the feature, but its configuration is not limited to this. The unit may instead extract, for example, the number of changes from black pixels to white pixels (or from white pixels to black pixels) on the boundary line. It may also extract the black pixel density on the boundary line, for example the number of black pixels per unit length on the boundary line, or the number of portions in which N or more black pixels occur consecutively (N being a preset number).
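
The alternative boundary features listed above can each be computed from the sequence of pixels along one boundary line. The sketch below assumes that sequence is a list with 1 for black and 0 for white. The function names are hypothetical, and density is taken here as black pixels per pixel of line length.

```python
from typing import List


def black_to_white_changes(pixels: List[int]) -> int:
    """Count positions where a black pixel is followed by a white pixel."""
    return sum(1 for a, b in zip(pixels, pixels[1:]) if a == 1 and b == 0)


def black_density(pixels: List[int]) -> float:
    """Black pixels per unit length of the boundary line."""
    return sum(pixels) / len(pixels) if pixels else 0.0


def long_black_runs(pixels: List[int], n: int) -> int:
    """Count runs of at least n consecutive black pixels."""
    runs = 0
    length = 0
    for p in pixels:
        if p:
            length += 1
        else:
            if length >= n:
                runs += 1
            length = 0
    if length >= n:  # a run that reaches the end of the line
        runs += 1
    return runs
```

Counting only runs of N or more pixels makes the feature less sensitive to isolated noise pixels than a raw black-pixel count.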

Also, in the embodiment above the display unit displays the boundary lines, the image data, and so on in different colors, but it may instead display them with different brightnesses.

(Effects of the Invention) As is clear from the above description, the character recognition device of the present invention extracts a feature of the boundary line of the recognition processing area designated by the operator, determines whether this feature satisfies a predetermined condition, and automatically changes the recognition processing area when it does not.

Therefore, even if the operator mistakenly designates a recognition processing area that cuts off part of the characters to be recognized, the character recognition device automatically changes the area so that it includes those characters. Conversely, even if the operator designates an area that includes an unwanted portion (for example, part of the characters of an adjacent recognition processing area), the device automatically changes the area so that the unwanted portion is excluded. Because recognition processing is then performed on an appropriate recognition processing area, a character recognition device with high accuracy and high processing speed can be provided.

In addition, the display unit can display the recognition processing area designated by the operator and the new recognition processing area superimposed in different colors or brightnesses so that the operator can tell them apart. The operator can therefore easily check whether the new recognition processing area is the intended one, which makes the device even more suitable.

[Brief Description of the Drawings]

FIG. 1 is a block diagram showing the configuration of the character recognition device of the embodiment. FIG. 2 shows the relationship between the image data in the image storage unit and the recognition processing area designated through the input unit. FIG. 3 shows a display example corresponding to FIG. 2. FIGS. 4(A) and 4(B) illustrate the problems of the prior art.

S: image data; 20: character recognition device; 30: image storage unit; 40: display unit; 50: input unit; 60: recognition unit; 62: line position detection unit; 64: character segmentation unit; 66: identification unit; 70: boundary feature extraction unit; 80: area changing unit; 90: control unit; 92: output terminal; 100: image data; 102: recognition processing area designated through the input unit; 110: display screen; 112: image data; 114: boundary line of the recognition processing area designated by the operator; 116: boundary line of the changed, new recognition processing area.

Patent applicant: Oki Electric Industry Co., Ltd.

Claims

[Claims]

(1) A character recognition device comprising an image storage unit that stores image data of characters and figures on a medium, a display unit that displays the image data, an input unit that designates a recognition processing area within the image data, and a recognition unit that cuts out image data character by character from the designated recognition processing area and recognizes the characters, the character recognition device being characterized by further comprising: a boundary feature extraction unit that extracts a feature of the boundary line of the recognition processing area designated by the input unit; and an area changing unit that changes the recognition processing area designated by the input unit based on the boundary line feature extracted by the boundary feature extraction unit.

(2) The character recognition device according to claim 1, wherein, when the feature of the boundary line does not satisfy a predetermined condition, the area changing unit sets a plurality of boundary candidate lines in the vicinity of the boundary line, extracts the features of these boundary candidate lines, and sets as the new boundary line the boundary candidate line closest to said boundary line among the boundary candidate lines whose features satisfy the predetermined condition.

(3) The character recognition device according to claim 1, wherein the boundary feature extraction unit is configured to extract the feature of the boundary line based on one or more of the following: the number of black pixels on the boundary line, the number of changes from black pixels to white pixels on the boundary line, and the density of black pixels on the boundary line.

(4) The character recognition device according to claim 1, wherein the display unit is configured to display the image data, the recognition processing area designated by the input unit, and the changed recognition processing area produced by the area changing unit superimposed on one another in a form that allows each to be visually distinguished.

(5) The character recognition device according to claim 1 or 4, wherein the display unit is configured to display the image data, the recognition processing area designated by the input unit, and the changed recognition processing area produced by the area changing unit in mutually different colors or brightnesses.