JPH1021332A - Non-linear normalizing method - Google Patents

Non-linear normalizing method

Info

Publication number
JPH1021332A
JPH1021332A (application numbers JP8173406A / JP17340696A)
Authority
JP
Japan
Prior art keywords
character
line
rectangular frame
distance
circumscribed rectangular
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP8173406A
Other languages
Japanese (ja)
Inventor
Fumiyoshi Nishio
文祥 西尾
Yasutaka Watanabe
康隆 渡辺
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tamura Electric Works Ltd
Original Assignee
Tamura Electric Works Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tamura Electric Works Ltd filed Critical Tamura Electric Works Ltd
Priority to JP8173406A priority Critical patent/JPH1021332A/en
Publication of JPH1021332A publication Critical patent/JPH1021332A/en
Pending legal-status Critical Current

Landscapes

  • Character Input (AREA)

Abstract

PROBLEM TO BE SOLVED: To extract the slightly curved lines of a character, such as a handwritten character, as straight character lines.
SOLUTION: When the distances between the character lines constituting a character to be identified and the distances between each side of a circumscribed rectangular frame 11 and each character line adjacent to the frame 11 are obtained, the reciprocals of the distances are taken as line densities and non-linear normalization processing is performed according to these line densities; in doing so, the line density in the vicinity of each character line is set to zero. A slightly curved character is thereby extracted as straight character lines. Further, a circumscribed rectangular frame 12 is formed outside the frame 11, the distances between the character lines adjacent to the frame 11 and the sides of the frame 12 are obtained, and the reciprocals of these distances are taken as line densities. This prevents the character image obtained by the non-linear normalization processing from being reduced in size.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Field of the Invention] The present invention relates to a character recognition technique for recognizing characters to be identified, such as handwritten characters, and more particularly to a non-linear normalizing method for applying non-linear normalization processing to the image of a character to be identified.

[0002]

[Description of the Related Art] In general, character recognition is performed according to the recognition processing flow shown in FIG. 9. First, in a preprocessing step S1, noise components are removed from the original image of the character to be recognized. Next, in a direction pattern creation step S2, a contour image is obtained from the noise-free original image and divided into pixels, each represented as a binary pixel: a black pixel carrying character information or a white pixel carrying no character information. The direction patterns of the contour are then extracted from this binary image of black and white pixels.
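The patent does not state how the contour image itself is obtained, so the following is only a minimal sketch of one common realization, assuming the binarized character is a 0/1 NumPy array (1 = black): a black pixel is kept as a contour pixel when at least one of its 4-neighbors is white. The function name and array conventions are illustrative assumptions.

```python
import numpy as np

def contour_image(binary: np.ndarray) -> np.ndarray:
    """Keep only black pixels (value 1) that touch at least one white
    4-neighbor; interior black pixels are dropped.  Sketch only - the
    patent does not specify the contour extraction method."""
    padded = np.pad(binary, 1, constant_values=0)
    up    = padded[:-2, 1:-1]
    down  = padded[2:,  1:-1]
    left  = padded[1:-1, :-2]
    right = padded[1:-1, 2:]
    has_white_neighbor = (up == 0) | (down == 0) | (left == 0) | (right == 0)
    return ((binary == 1) & has_white_neighbor).astype(np.uint8)
```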

[0003] After the direction patterns have been extracted in this way, the next non-linear normalization step S3 enlarges or reduces the contour image at a different magnification for each point of the image, as described later. Then, in the following feature vector creation step S4, each direction pattern of the non-linearly normalized image is equally divided into 8 × 8 small square regions, and the number of black pixels in each small region is taken as a feature amount, so that a feature vector is created for each direction pattern.
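A minimal sketch of the feature amount described here, assuming the normalized direction pattern is a 0/1 NumPy array whose side length is a multiple of 8: the pattern is split into an 8 × 8 grid of equal squares and the black-pixel count of each square becomes one component of the feature vector. Names and shapes are assumptions for illustration.

```python
import numpy as np

def direction_feature_vector(pattern: np.ndarray, grid: int = 8) -> np.ndarray:
    """Split a normalized direction pattern (e.g. 64 x 64, values 0/1) into
    grid x grid equal squares and return the black-pixel count of each
    square as a feature vector of length grid * grid."""
    h, w = pattern.shape
    cell_h, cell_w = h // grid, w // grid
    counts = (pattern[:grid * cell_h, :grid * cell_w]
              .reshape(grid, cell_h, grid, cell_w)
              .sum(axis=(1, 3)))
    return counts.ravel().astype(np.float32)
```

For a 64 × 64 pattern this yields the 64 block counts used as the direction feature vector in step S4.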

[0004] Next, in a coarse classification step S5, the direction feature vector of the input character (the character to be recognized) obtained in step S4 is compared with the vectors of a dictionary prepared in advance, and a plurality of candidate characters whose dictionary vectors are close are extracted in order of increasing distance. This distance is computed with the common calculation method called the 4-neighbor distance (city-block distance), shown in equation (1).

[0005] That is,

d4((i, j), (h, k)) = |i − h| + |j − k|   (1)

where (i, j) are the coordinate values of the input character and (h, k) are the coordinate values of the dictionary character. Thereafter, in a detailed classification step S6, the final character identification is performed through a detailed comparison using a more precise pattern (for example, a separate 32 × 32-dimensional feature vector is prepared).
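As an illustration of the coarse classification, the sketch below applies the city-block metric of equation (1) component-wise to whole feature vectors and returns the nearest dictionary candidates. Treating the feature vectors with this same metric, and the dictionary layout, are assumptions made for this example.

```python
import numpy as np

def city_block_distance(u: np.ndarray, v: np.ndarray) -> float:
    """4-neighbor (city-block) distance of equation (1): the sum of the
    absolute differences of corresponding components."""
    return float(np.abs(u - v).sum())

def coarse_candidates(query: np.ndarray, dictionary: dict, top_n: int = 10):
    """Return the top_n dictionary entries closest to the query vector,
    nearest first.  `dictionary` maps a character label to its vector."""
    ranked = sorted(dictionary.items(),
                    key=lambda item: city_block_distance(query, item[1]))
    return ranked[:top_n]
```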

[0006] FIG. 8 shows an outline of the non-linear normalization processing. As described above, the direction pattern creation step S2 extracts from the original figure (taken here as the contour image) shown in FIG. 8(a) the horizontal, vertical, right-downward and right-upward direction patterns shown in FIG. 8(b). Separately from this, the non-linear normalization step S3 first obtains the line spacings from the original figure of FIG. 8(a) to produce the line density information shown in FIG. 8(c).

[0007] Here, the original figure is divided into 8 × 8 regions, and the non-linear normalization processing determines the positions of the dividing lines so that, according to the above line density information, each region receives the same line density. FIG. 8(d) shows the result of this non-linear normalization processing. Then, based on the non-linear normalization information indicating the positions of the dividing lines, a non-linear mapping is applied to each of the direction patterns of FIG. 8(b), and a processing result such as that of FIG. 8(e) is obtained for every direction pattern. A feature vector is then extracted from each direction pattern of the non-linearly normalized image in the next feature vector creation step S4.

[0008]

[Problems to be Solved by the Invention] In the conventional non-linear normalization processing, the line density information is obtained from the original figure by counting the spacing (inter-line width) between character lines. Consequently, in the case of a slightly curved, nominally straight character line such as occurs in handwriting, a line density that never arises for a purely straight character line is generated at the curved portion, and the character line is extracted while still curved. This adversely affects the subsequent character recognition, and it is desired that such a line be extracted as a straight character line. It is therefore an object of the present invention to extract straight character lines that vary from person to person, as in handwritten characters, as truly straight character lines.

[0009]

[Means for Solving the Problems] To solve the above problems, the present invention provides a non-linear normalizing method in which a first circumscribed rectangular frame circumscribing the character to be identified is set, the distances between the character lines constituting the character to be identified and the distances between each side of the first circumscribed rectangular frame and each character line adjacent to it are obtained, the reciprocal of each distance is taken as a line density, the first circumscribed rectangular frame is divided into small regions according to this line density information, and the information of each divided small region is mapped individually onto regions obtained by equally dividing an area of a prescribed size; in this method, the line density in the vicinity of each character line is set to zero. Consequently, in the case of a slightly curved character line such as a handwritten one, the line density arising in its vicinity is removed, and such a curved character line can be extracted as a proper straight character line. Furthermore, a second circumscribed rectangular frame, a constant multiple of the first circumscribed rectangular frame, is provided outside the first circumscribed rectangular frame, the distance between each character line adjacent to the first circumscribed rectangular frame and each side of the second circumscribed rectangular frame is obtained, and the reciprocal of this distance is taken as the line density. As a result, the line density of a character image in which the character lines adjacent to the sides of the first circumscribed rectangular frame lie close to those sides can be made small, so that the shrinkage of the character image that would otherwise occur in non-linear normalization is prevented, and problems such as the difficulty of distinguishing similar characters, for example "田" and "由", after non-linear normalization can be avoided.

[0010]

[Embodiments of the Invention] The present invention will now be described with reference to the drawings. FIG. 1 is a block diagram showing the configuration of an apparatus to which the present invention is applied. As shown in the figure, the apparatus is a handwritten character input device comprising a CPU 1, a touch panel 2, an LCD 3 formed on the lower surface of the touch panel, a memory 4, and a dictionary memory 5. When an input pen (not shown) is moved, for example, from a point A to a point B on the touch panel 2 while being pressed down, the CPU 1 takes in the movement locus of the input pen during this interval as a plurality of coordinate values through the touch panel 2. The CPU 1 then stores the line segment from point A to point B in the memory 4 and displays that segment at the corresponding position on the LCD 3. In this way handwritten characters based on pen input can be displayed on the LCD 3.

[0011] For the handwritten character thus stored in the memory 4 and displayed on the LCD 3, the CPU 1 performs the processing described below and compares the processing result with the character types stored in the dictionary memory 5, thereby recognizing which character type the handwritten character corresponds to. That is, the CPU 1 first performs preprocessing that removes noise and the like from the input character (the handwritten character to be identified) to obtain a smooth character. Thereafter, a contour image as shown in FIG. 2(b) is obtained from the original image shown in FIG. 2(a), and the contour image is divided into 64 × 64 pixels.

[0012] Then, for every pixel (black pixel) constituting the contour, the direction components possessed by that pixel are detected. That is, attention is paid to an arbitrary black pixel on the contour, and the direction components of the pixel of interest are detected from the connection pattern between the pixel of interest and the black pixels immediately before and after it. FIG. 3 shows how the direction components of such black pixels are detected.

[0013] In FIG. 3, the hatched pixels a to c are black pixels constituting the contour, and the pixels other than the hatched ones are white pixels. Let the pixel of interest be b and the black pixels before and after it be a and c, respectively. The contour enters the pixel of interest b horizontally from the black pixel a and then passes through b to reach the black pixel c, which lies in the right-upward direction; the pixel of interest b therefore has two direction components, a horizontal component and a right-upward component. These two direction components are detected as the direction components of the pixel of interest b. In addition to the two direction components mentioned above, the direction components of black pixels constituting a contour include a vertical component and a right-downward component, and each black pixel is assigned two of these four direction components. When extracting the direction patterns of the contour, the CPU 1 detects the direction components of every black pixel on the contour in turn and, based on the detected direction components, extracts the horizontal, vertical, right-upward and right-downward direction patterns of the contour.
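The following sketch shows one way to assign the two direction components of the pixel of interest b from its preceding and following contour pixels a and c, as in the example above. The (row, column) coordinate convention and the sign test separating right-upward from right-downward steps are assumptions.

```python
def step_direction(p, q):
    """Classify the step from contour pixel p to contour pixel q, both given
    as (row, col), into one of the four direction components."""
    dr, dc = q[0] - p[0], q[1] - p[1]
    if dr == 0:
        return "horizontal"
    if dc == 0:
        return "vertical"
    # Rows grow downward, so a step whose row and column changes have
    # opposite signs runs along a right-upward diagonal.
    return "right-up" if dr * dc < 0 else "right-down"

def pixel_direction_components(a, b, c):
    """Direction components assigned to the pixel of interest b, given the
    preceding contour pixel a and the following contour pixel c."""
    return {step_direction(a, b), step_direction(b, c)}
```

For the example of FIG. 3 this returns the set {"horizontal", "right-up"}.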

[0014] Next, the CPU 1 performs the non-linear normalization processing of the contour image. That is, as shown in FIG. 4(a), the distances between the character lines (character strokes) of the character image "田" (for example, A, B, C and D in the figure) are obtained, and their reciprocals are taken as line densities. The distances between character lines are counted separately in the horizontal direction and in the vertical direction, and their reciprocals give the horizontal line density and the vertical line density, respectively. In this way the line density is obtained for every point between character lines (that is, for every white point other than the character lines, shown in black, in the figure). The line density of the character lines themselves (the black points) is set to "0". In the figure, reference numeral 11 denotes the circumscribed rectangular frame that circumscribes the character being processed.

[0015] Next, the circumscribed rectangular frame 11 is converted to coordinates, and, as shown in FIG. 4(b), the horizontal line densities of all the white points are accumulated along the x-axis and the vertical line densities along the y-axis. This produces line density histograms in the horizontal and vertical directions. Reference numeral 21 in FIG. 4(b) denotes an arbitrary white point. The CPU 1 then divides each line density histogram into eight parts. As shown in FIG. 4(c), the division positions are chosen so that the sums of the eight divided portions of the histogram are all equal. The region of the circumscribed rectangular frame 11 is then divided into eight parts horizontally and eight parts vertically by the dividing lines determined from these division positions, yielding 8 × 8 small regions.
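One way to realize this equal-sum division is to cut the cumulative sum of the histogram at multiples of one eighth of its total, as sketched below; the patent only requires that the eight partial sums be equal, so the cumulative-sum approach and the function name are assumptions.

```python
import numpy as np

def equal_sum_split_positions(histogram: np.ndarray, parts: int = 8):
    """Indices that split a 1-D density histogram into `parts` consecutive
    segments whose sums are (approximately) equal."""
    cum = np.cumsum(histogram)
    targets = cum[-1] * np.arange(1, parts) / parts
    # First index at which the cumulative sum reaches each target value.
    return [int(np.searchsorted(cum, t)) for t in targets]
```

The x-direction dividing lines would come from the column sums of the horizontal density map and the y-direction dividing lines from the row sums of the vertical density map, following FIGS. 4(b) and 4(c).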

[0016] Each of the 8 × 8 small regions divided in this way, shown in FIG. 4(d), is mapped onto the square regions of FIG. 4(e), which are equally divided into 8 × 8 at a prescribed size. Although the original image is shown in FIG. 4 for convenience of explanation, the image actually handled here is the contour of the original image. For each individual region the size of the mapping source differs from the size of the corresponding mapping destination, so each source region is enlarged or reduced to fit the corresponding destination region when it is mapped.
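A sketch of this remapping step: each variable-size source region delimited by the computed dividing lines is rescaled onto a fixed-size cell of the destination grid. Nearest-neighbor resampling and the destination cell size are assumptions, since the patent does not state how the enlargement or reduction is interpolated.

```python
import numpy as np

def remap_cell(src: np.ndarray, out_h: int, out_w: int) -> np.ndarray:
    """Nearest-neighbor rescaling of one source region to out_h x out_w."""
    h, w = src.shape
    rows = (np.arange(out_h) * h) // out_h
    cols = (np.arange(out_w) * w) // out_w
    return src[np.ix_(rows, cols)]

def nonlinear_normalize(pattern: np.ndarray, x_splits, y_splits,
                        cell: int = 8) -> np.ndarray:
    """Map the 8 x 8 variable-size regions of `pattern`, delimited by
    x_splits / y_splits, onto an equally divided output image whose cells
    are all cell x cell pixels."""
    ys = [0] + list(y_splits) + [pattern.shape[0]]
    xs = [0] + list(x_splits) + [pattern.shape[1]]
    out = np.zeros((cell * (len(ys) - 1), cell * (len(xs) - 1)), pattern.dtype)
    for i in range(len(ys) - 1):
        for j in range(len(xs) - 1):
            src = pattern[ys[i]:ys[i + 1], xs[j]:xs[j + 1]]
            if src.size:                   # skip degenerate empty regions
                out[i * cell:(i + 1) * cell,
                    j * cell:(j + 1) * cell] = remap_cell(src, cell, cell)
    return out
```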

[0017] Such non-linear normalization processing is carried out for each of the vertical, horizontal, right-upward and right-downward direction patterns of the contour. After the processing is completed, the CPU 1 counts the number of black pixels in each of the equally divided small regions and uses the count of each small region as the direction feature vector of that direction pattern. Thereafter, for the obtained direction feature vector, a plurality of candidate characters whose dictionary vectors in the dictionary memory 5 are close are extracted in order of increasing distance, and the final character identification is performed by comparing the extracted candidate characters with the input character in further detail.

[0018] When the line density is obtained from the original figure, the distance between character lines (inter-line width) is counted and its reciprocal is taken. However, in the case of a slightly curved, nominally straight character line 31 as shown in FIG. 7(a), the distance h at the curved portion produces a line density that never arises for a purely straight character line, so the line cannot be extracted as a proper straight character. Furthermore, in characters such as "園" and "田", the spacings A, B and C between the circumscribed rectangular frame 11 and the character lines are small, as shown in FIG. 7(b). If such a character image is subjected to non-linear normalization, the line density, being the reciprocal of the spacing, becomes large at the spacings A, B and C; consequently the outline of the character after non-linear normalization separates from the circumscribed rectangular frame 11 and, as shown in FIG. 7(d), becomes considerably smaller than its original size. This makes it difficult to distinguish similar characters such as "田" and "由".

[0019] Therefore, when the line density is to be obtained, attention is paid to the position of a given white point, the directions in which character lines exist around that white point are examined, and the distance to the character line is obtained on the basis of this information; the line density is then obtained as the reciprocal of the obtained distance. First, as shown in FIG. 5, a circumscribed rectangular frame 12, a constant multiple of the circumscribed rectangular frame 11, is provided outside the frame 11. Then, as in FIG. 5(a), when a character line 31 exists in only one of the four directions (above, below, to the left of or to the right of the white point 21), the distance h between the character line 31 and the side of the circumscribed rectangular frame 12 on which the white point 21 lies is counted for the direction in which the character line 31 exists, and its reciprocal is taken as the line density. For the direction in which no character line 31 exists, the reciprocal of the distance w between the opposite sides of the circumscribed rectangular frame 12 is taken as the line density.

[0020] Next, as in FIG. 5(b), when character lines 31A and 31B exist above and below the white point 21 (or when character lines exist to its left and right), the reciprocal of the distance h between the two character lines is taken as the line density for the direction in which 31A and 31B exist, and for the direction in which no character line exists the reciprocal of the distance w between the sides of the circumscribed rectangular frame 12 is taken as the line density. Next, as in FIG. 5(c), when character lines 31A and 31B exist above and to the left of the white point 21 (the combinations above and right, below and left, or below and right are treated likewise), the distances h and w between each of the character lines 31A and 31B and the side of the circumscribed rectangular frame 12 on which the white point 21 lies are counted, and their reciprocals are taken as the respective line densities.

[0021] Next, as in FIG. 5(d), when character lines 31A, 31B and 31C exist to the left and right of and above the white point 21 (the combinations left-right-below, above-below-left, or above-below-right are treated likewise), the distance h between the character line 31A and the side of the circumscribed rectangular frame 12 on which the white point 21 lies is counted and its reciprocal is taken as the line density, while the distance w between the character lines 31B and 31C is counted and its reciprocal is taken as the other line density. Finally, as in FIG. 5(e), when character lines 31A to 31D exist in all four directions around the white point 21, the distances h and w between the opposing character lines are counted and their reciprocals are taken as the respective line densities.
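The per-axis case analysis of FIGS. 5(a) to 5(e) can be summarized as follows: if strokes are found on both sides of the white point along an axis, the stroke-to-stroke distance is used; if on one side only, the distance from that stroke to the side of the outer frame 12 on which the white point lies is used; if on neither side, the full extent of frame 12 along that axis is used. The sketch below follows this reading; the coordinate conventions and function names are assumptions.

```python
import numpy as np

def axis_distance(profile: np.ndarray, pos: int,
                  frame_lo: int, frame_hi: int) -> float:
    """Distance used for the line density along one axis, seen from the white
    point at index `pos` of the 1-D slice `profile` (1 = stroke pixel).
    frame_lo / frame_hi are the coordinates of the two sides of the outer
    circumscribed frame 12 along this axis."""
    before = np.flatnonzero(profile[:pos] == 1)
    after = np.flatnonzero(profile[pos + 1:] == 1)
    lo = before[-1] if before.size else None          # nearest stroke, low side
    hi = pos + 1 + after[0] if after.size else None   # nearest stroke, high side
    if lo is not None and hi is not None:
        return float(hi - lo)              # stroke to stroke (FIG. 5(b), 5(e))
    if lo is not None:
        return float(frame_hi - lo)        # stroke to frame-12 side (FIG. 5(a))
    if hi is not None:
        return float(hi - frame_lo)
    return float(frame_hi - frame_lo)      # no stroke: full frame-12 extent

def white_point_density(binary: np.ndarray, r: int, c: int, frame12):
    """Horizontal and vertical line densities (reciprocal distances) for the
    white point (r, c); frame12 = (top, bottom, left, right) of frame 12."""
    top, bottom, left, right = frame12
    dv = axis_distance(binary[:, c], r, top, bottom)   # scan the column
    dh = axis_distance(binary[r, :], c, left, right)   # scan the row
    return 1.0 / dh, 1.0 / dv
```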

[0022] In this way the line density of the character image can be obtained. It is therefore possible to avoid the problem that, when the distances A and D between the circumscribed rectangular frame 11 and the character lines are obtained and their reciprocals are taken as line densities as in FIG. 6(f), the character image is shrunk by the non-linear normalization if the distances A and D are short. That is, as shown in FIG. 6(g), the circumscribed rectangular frame 12 is provided outside the circumscribed rectangular frame 11, and the distances A and D are widened into the distances A' and D' measured to the circumscribed rectangular frame 12, which makes the line density smaller. As a result, the size of the character outline after the non-linear normalization processing is not reduced but keeps the size it had before the processing, so that problems such as the difficulty of distinguishing similar characters, for example "田" and "由", can be avoided.

[0023] Further, as shown in FIGS. 6(a) to 6(d), the horizontal position of the white point 21 is set to a position roughly corresponding to the tip of a vertical character line (for example, the character line 31B), the vertical distance h from the white point 21 to the horizontal character line 31A is counted, and the horizontal distance w from the vertical character line 31B through the white point 21 to the other end of the horizontal character line 31A (or to another vertical character line 31C) is counted. By comparing the magnitudes of the counted vertical distance h and horizontal distance w, only the horizontal character line 31A can be extracted as a straight character line.

[0024] That is, in cases such as FIGS. 6(b) and 6(d), where the vertical distance h is extremely small compared with the horizontal distance w, the distance h is set to "0" and its line density is therefore set to "0". Accordingly, in the cases of FIGS. 6(b) and 6(d), the lines are regarded as the single character line 31A, and only the straight character line is extracted. In cases such as FIGS. 6(a) and 6(c), where the vertical distance h is not extremely small compared with the horizontal distance w, the distance h is kept instead of being set to "0", and its reciprocal is taken as the line density.

[0025] In this way, when the vertical distance h is extremely small compared with the horizontal distance w, the distance h becomes "0"; likewise, when the horizontal distance w is extremely small compared with the vertical distance h, the distance w becomes "0". Consequently, around the character lines of a handwritten character such as "上" in FIG. 6(e), whose straight portions are slightly curved, there are regions where the line spacing is "0" (that is, the line density is "0"). Thus, because the distances in the vertical and horizontal directions are compared and one distance is set to "0" when it is extremely smaller than the other, the line density around a slightly curved straight character line 31 such as that of FIG. 7(a) becomes "0", and the line can be extracted as a straight character line. The circumscribed rectangular frame 12 does not necessarily have to be provided for this straight-line extraction. Although this embodiment has been described using handwritten characters as an example, any character data may be used as long as it is bitmap data, such as character data read by OCR.
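A sketch of the suppression rule of paragraphs [0024] and [0025]: when one of the two distances is extremely small relative to the other, it is treated as zero so that the line density next to a slightly curved stroke vanishes. The ratio threshold used to decide "extremely small" is an assumption, since the patent does not quantify it.

```python
def suppress_near_stroke_density(dh: float, dv: float, ratio: float = 0.1):
    """Return (horizontal density, vertical density) from the horizontal and
    vertical distances dh and dv, setting a distance to zero when it is
    extremely small compared with the other one (assumed ratio threshold)."""
    if dv < ratio * dh:
        dv = 0.0
    elif dh < ratio * dv:
        dh = 0.0
    density_h = 1.0 / dh if dh > 0 else 0.0
    density_v = 1.0 / dv if dv > 0 else 0.0
    return density_h, density_v
```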

[0026]

[Effects of the Invention] As described above, according to the present invention, a first circumscribed rectangular frame circumscribing the character to be identified is provided, the distances between the character lines constituting the character to be identified and the distances between each side of the first circumscribed rectangular frame and each character line adjacent to it are obtained, the reciprocal of each distance is taken as a line density, the first circumscribed rectangular frame is divided into small regions according to this line density information, and the information of each divided small region is mapped individually onto regions equally divided at a prescribed size; in doing so, the line density in the vicinity of each character line is set to zero, so that in the case of a slightly curved character line such as a handwritten one the line density arising in its vicinity is removed and the line can be extracted as a proper straight character line. In addition, a second circumscribed rectangular frame, a constant multiple of the first circumscribed rectangular frame, is provided outside the first circumscribed rectangular frame, the distance between each character line adjacent to the first circumscribed rectangular frame and each side of the second circumscribed rectangular frame is obtained, and the reciprocal of this distance is taken as the line density. The line density of a character image whose character lines adjacent to the sides of the first circumscribed rectangular frame lie close to those sides can therefore be made small, the shrinkage of the character image that would otherwise occur under non-linear normalization is prevented, and problems such as the difficulty of distinguishing similar characters, for example "田" and "由", after non-linear normalization can be avoided.

[Brief Description of the Drawings]

FIG. 1 is a block diagram showing the configuration of an apparatus to which the present invention is applied.

FIG. 2 shows a character image recognized by this apparatus and the contour of that character image.

FIG. 3 shows how the direction components of pixels are detected in this apparatus.

FIG. 4 shows the non-linear normalization processing of a contour.

FIG. 5 shows how the distances (spacings) between character lines are calculated during non-linear normalization.

FIG. 6 shows the main part of the processing for calculating the distances between character lines.

FIG. 7 shows examples of problems that occur in conventional non-linear normalization.

FIG. 8 shows the processing procedure of conventional non-linear normalization.

FIG. 9 is a recognition processing flow showing the process of character recognition.

[Explanation of Symbols]

1: CPU, 2: touch panel, 3: LCD, 4: memory, 5: dictionary memory, 11, 12: circumscribed rectangular frames.

Claims (2)

[Claims]

1. A non-linear normalizing method in which a first circumscribed rectangular frame circumscribing a character to be identified is provided, the distances between the character lines constituting the character to be identified and the distances between each side of the first circumscribed rectangular frame and each character line adjacent to the first circumscribed rectangular frame are obtained, the reciprocal of each distance is taken as a line density, the first circumscribed rectangular frame is divided into small regions according to this line density information, and the information of each divided small region is individually mapped onto regions equally divided at a prescribed size, the method being characterized in that the line density in the vicinity of each character line is set to zero.

2. The non-linear normalizing method according to claim 1, characterized in that a second circumscribed rectangular frame, a constant multiple of the first circumscribed rectangular frame, is provided outside the first circumscribed rectangular frame, the distance between each character line adjacent to the first circumscribed rectangular frame and each side of the second circumscribed rectangular frame is obtained, and the reciprocal of this distance is taken as the line density.
JP8173406A 1996-07-03 1996-07-03 Non-linear normalizing method Pending JPH1021332A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP8173406A JPH1021332A (en) 1996-07-03 1996-07-03 Non-linear normalizing method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP8173406A JPH1021332A (en) 1996-07-03 1996-07-03 Non-linear normalizing method

Related Child Applications (1)

Application Number Title Priority Date Filing Date
JP8200089A Division JPH1021333A (en) 1996-07-30 1996-07-30 Non-linear normalizing method

Publications (1)

Publication Number Publication Date
JPH1021332A true JPH1021332A (en) 1998-01-23

Family

ID=15959844

Family Applications (1)

Application Number Title Priority Date Filing Date
JP8173406A Pending JPH1021332A (en) 1996-07-03 1996-07-03 Non-linear normalizing method

Country Status (1)

Country Link
JP (1) JPH1021332A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6795051B2 (en) 2000-05-22 2004-09-21 Nec Corporation Driving circuit of liquid crystal display and liquid crystal display driven by the same circuit
WO2013036329A1 (en) * 2011-09-06 2013-03-14 Qualcomm Incorporated Text detection using image regions
US8942484B2 (en) 2011-09-06 2015-01-27 Qualcomm Incorporated Text detection using image regions

Similar Documents

Publication Publication Date Title
CN104751142B (en) A kind of natural scene Method for text detection based on stroke feature
JP2933801B2 (en) Method and apparatus for cutting out characters
CN110197153B (en) Automatic wall identification method in house type graph
US20080212837A1 (en) License plate recognition apparatus, license plate recognition method, and computer-readable storage medium
JP2002133426A (en) Ruled line extracting device for extracting ruled line from multiple image
JP2014153820A (en) Character segmentation device and character segmentation method
JP3830998B2 (en) Ruled line removal method and character recognition apparatus using the same
CN116052152A (en) License plate recognition system based on contour detection and deep neural network
JP2000235619A (en) Surface image processor and its program storage medium
JPH0950527A (en) Frame extracting device and rectangle extracting device
JPH08190690A (en) Method for determining number plate
JPH1021332A (en) Non-linear normalizing method
JP3172498B2 (en) Image recognition feature value extraction method and apparatus, storage medium for storing image analysis program
JP2868134B2 (en) Image processing method and apparatus
JP2005182660A (en) Recognition method of character/figure
JPH1021333A (en) Non-linear normalizing method
JP4194309B2 (en) Document direction estimation method and document direction estimation program
JP2871590B2 (en) Image extraction method
JP2785747B2 (en) Character reader
JPH02116987A (en) Character recognizing device
JPH0573718A (en) Area attribute identifying system
JP3077929B2 (en) Character extraction method
JP3343305B2 (en) Character extraction device and character extraction method
JP2803709B2 (en) Character recognition device and character recognition method
JPH1021398A (en) Method for extracting directional characteristic vector