JPS60153574A - Character reading system - Google Patents

Character reading system

Info

Publication number
JPS60153574A
Authority
JP
Japan
Prior art keywords
character, pattern, cutting, characters, line
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
JP59009831A
Other languages
Japanese (ja)
Other versions
JPH0614372B2 (en)
Inventor
Sueji Miyahara
末治 宮原
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nippon Telegraph and Telephone Corp
Original Assignee
Nippon Telegraph and Telephone Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
1984-01-23
Filing date
1984-01-23
Publication date
1985-08-13
Application filed by Nippon Telegraph and Telephone Corp
Priority to JP59009831A
Publication of JPS60153574A
Publication of JPH0614372B2
Current legal status: Expired - Lifetime

Landscapes

  • Character Discrimination (AREA)
  • Character Input (AREA)

Abstract

PURPOSE: To read characters accurately even when characters of different sizes touch each other, by determining the number of divisions according to the size of a lump of black strings on a character line when that lump is larger than a fixed section, and by applying several mutually different character segmenting methods. CONSTITUTION: Characters on a form are converted into binary pattern data by a photoelectric conversion circuit and stored temporarily in a pattern memory 12 via an input terminal 11. A character segmenting part 13 takes from the pattern memory 12 a row pattern containing one line of characters and, while moving the point of interest in the row direction, scans in the column direction to obtain data (black string data) in which each position containing part of the pattern is represented by its number of black picture elements. On the basis of the black string data, the character segmenting part 13 segments individual patterns or forcibly separated patterns from the row pattern as patterns for discrimination and sends them to a feature extracting part 14. The feature extracting part 14 extracts the features of each character, a discrimination part 15 collates the extracted features with a discrimination dictionary part 16, and a character decision part 17 processes the resulting data and outputs the selected character as the character reading result.

Description

[Detailed Description of the Invention] (Technical Field) The present invention relates to a character reading method capable of reading, with high accuracy and at high speed, the characters of a document that contains many touching characters, such as one in which the character pitch is equal to the character size.

(Prior Art) The present inventor previously invented a character reading method that segments characters one at a time from a character-line pattern, obtained by scanning and photoelectrically converting the text on a form, and performs character recognition on them. That method has a segmentation step and a character determination step. The segmentation step counts the black-dot clusters present within a predetermined fixed interval on the character line. If there is one cluster, the interval is regarded as a single-character pattern and segmented as such. If there are several, the clusters are combined successively in suitable ways, each of the resulting combination patterns is regarded as a single-character pattern and segmented, and the segmented patterns are output together with information about the segmentation. The character determination step uses the identification result of each segmented pattern and the segmentation information. Where the interval was regarded as a single-character pattern, the identification result is output as is. Where it was regarded as multiple patterns, the step outputs, from among the identification results of the combination patterns, the one corresponding to the combination pattern with the greatest pattern width. That invention is the subject of a pending patent application by the present applicant (Japanese Patent Application No. 57-222489). The prior invention can read accurately and at high speed documents whose character pitch is not constant and documents in which full-width and half-width characters are mixed. However, the intended reading result may not be obtained in some cases, for example when characters of different sizes touch each other or when one of two touching characters is partly missing.

(Object of the Invention) In view of the problems described above, the object of the present invention is to provide a character reading method that can read characters with still higher accuracy and at high speed even when characters of different sizes touch each other or when one of two touching characters is partly missing.

(Structure of the Invention) To achieve the above object, the first invention is a character reading method in which characters are segmented one at a time from a black-and-white binary character-line pattern, obtained by scanning and photoelectrically converting the characters on a form, and character recognition is performed on them, characterized by comprising a character segmentation step and a character determination step. The character segmentation step counts the black-dot clusters present within a predetermined fixed interval on the character line and, when a black-dot cluster on the character line is larger than the predetermined fixed interval, determines the number of divisions according to the size of the cluster, applies a plurality of mutually different character segmentation methods, and outputs information about the character segmentation together with the forcibly separated segmented patterns. The character determination step uses the identification results of the segmented patterns and the information about their segmentation to regard, among the plurality of character segmentation methods, the one showing the highest certainty value as the optimal character segmentation method, and outputs the identification results obtained with that method as the character reading result.

The second invention is a character reading method for the same kind of black-and-white binary character-line pattern, likewise characterized by comprising a character segmentation step and a character determination step. Its character segmentation step counts the black-dot clusters present within a predetermined fixed interval on the character line and, when a black-dot cluster on the character line is larger than the predetermined fixed interval, sets a plurality of division counts for the cluster according to the size of the cluster, applies a plurality of mutually different character segmentation methods, and outputs information about the character segmentation together with the forcibly separated segmented patterns. Its character determination step uses the identification results of the segmented patterns and the information about their segmentation to regard, among the plurality of character segmentation methods, the one showing the highest certainty value as the optimal character segmentation method, and outputs the identification results obtained with that method as the character reading result.

(Embodiment) The drawings show an embodiment of the present invention. In the drawings, 11 is an input terminal, 12 is a pattern memory, 13 is a character segmentation unit, 14 is a feature extraction unit, 15 is an identification unit, 16 is an identification dictionary unit, 17 is a character determination unit, and 18 is an output terminal.

The operation of each part in the above configuration is described below. First, the characters on the form are converted into black-and-white binary pattern data by a photoelectric conversion device (not shown), and this data is stored temporarily in the pattern memory 12 via the input terminal 11. The character segmentation unit 13 takes from the pattern memory 12 a line pattern 20 containing one line of characters, as shown in Fig. 2. It then scans in the column direction (the direction of arrow Y in the figure) while moving the point of interest in the row direction (the direction of arrow X in the figure), and extracts data 30 in which each position where the pattern exists is represented by its number of black pixels and each position where no pattern exists is represented by 0 (hereinafter called black-dot sequence data).

Based on the black-dot sequence data 30, the character segmentation unit 13 then carries out the character segmentation processing described later and segments individual patterns 21, forcibly separated patterns 22, and the like from the line pattern 20 as patterns for identification. Each pattern is sent in turn to the feature extraction unit 14 as character pattern data for identification, paired with information about the character segmentation. This information consists of the segmentation position within the line pattern 20, the number N of black-dot clusters within the fixed interval α, an operation number DNO indicating how many times the operation for detecting black-dot clusters has been repeated, a pattern number PNO of the pattern created by combining the clusters within the fixed interval α, the number M of kinds of forced separation, and the number of separations L for each kind of forced separation.

The feature extraction unit 14 extracts the features of the character from each character pattern it receives and sends the feature data and the segmentation information to the identification unit 15. The identification unit 15 collates the features with the identification dictionary unit 16, identifies the character patterns in turn, and sends each identification result (for example, a character code and a similarity) paired with the segmentation information to the character determination unit 17. The character determination unit 17 applies the processing described later to the data it receives and outputs the selected result to the output terminal 18 as the character reading result.
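The column scan and cluster detection described above can be sketched in Python as follows. This is only an illustrative sketch, not the circuit of the embodiment: the function names `black_dot_sequence` and `find_clusters`, and the representation of the line pattern as a list of rows of 0/1 pixels, are assumptions introduced here.

```python
from typing import List, Tuple


def black_dot_sequence(line_pattern: List[List[int]]) -> List[int]:
    """Count the black pixels (1s) in each column of a binary line image."""
    if not line_pattern:
        return []
    width = len(line_pattern[0])
    return [sum(row[x] for row in line_pattern) for x in range(width)]


def find_clusters(profile: List[int]) -> List[Tuple[int, int]]:
    """Return (start, end) column ranges, end exclusive, of nonzero runs."""
    clusters = []
    start = None
    for x, count in enumerate(profile):
        if count > 0 and start is None:
            start = x                    # a cluster begins at this column
        elif count == 0 and start is not None:
            clusters.append((start, x))  # the cluster ended before column x
            start = None
    if start is not None:                # cluster runs to the right edge
        clusters.append((start, len(profile)))
    return clusters


# Hypothetical 3-row line image with two separate strokes.
line = [
    [0, 1, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 1, 1],
    [0, 1, 1, 0, 0, 0, 1],
]
profile = black_dot_sequence(line)
clusters = find_clusters(profile)
print(profile)   # [0, 3, 2, 0, 0, 2, 2]
print(clusters)  # [(1, 3), (5, 7)]
```

The width of each cluster (end minus start) is the quantity that would be compared with the fixed interval α to decide whether forced separation is needed.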

The character segmentation processing by which the character segmentation unit 13 creates the forcibly separated patterns 22 is shown in Fig. 3. The cases in which the fixed interval α of the line pattern 20 contains no black-dot cluster, one cluster, or several clusters are described in detail in Japanese Patent Application No. 57-222489 and are not discussed here. When a black-dot cluster is larger than the fixed interval α, the number L of character patterns to segment is determined from the size of the cluster. For example, when the cluster width BL satisfies $(L - \tfrac{1}{2})\,MA \le BL < (L + \tfrac{1}{2})\,MA$, where MA is the average character width, L is taken as the number of characters to segment. Likewise, when BL and the average width MA' of half-width characters satisfy $(L' - \tfrac{1}{2})\,MA' \le BL < (L' + \tfrac{1}{2})\,MA'$, L' is also taken as a number of characters to segment. Character segmentation is carried out using this value.

Several kinds (M kinds) of character segmentation method are applied at this point. For example, when M = 3 they may be: method (1), which cuts in units of the average character width MA from the start position of the cluster; method (2), which cuts in units of the average character width MA from the end position of the cluster; and method (3), which takes the points that divide the span between the start and end positions of the cluster into L equal parts as candidate cutting positions. The segmented patterns are sent to the downstream identification unit 15 together with the information about the character segmentation: the number L of character patterns segmented, the number M of kinds of segmentation, the pattern number PNO (PNO = 1 to L) assigned in turn to the character patterns segmented by one method, and the operation number DNO (DNO = 1 to M) assigned by incrementing from 1 in steps of +1 for each segmentation method.

The identification unit 15 collates the features of the character pattern extracted by the feature extraction unit 14 with the character features held in the identification dictionary unit 16, selects those whose similarity is at or above a fixed value as the identification result, and outputs the character code, similarity, and so on to the character determination unit 17 together with the segmentation information. From the segmentation information and identification results sent from the identification unit 15, the character determination unit 17 carries out the character determination processing shown in Fig. 4.
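The choice of the division count L and the three example segmentation methods (M = 3) can be sketched as follows. This is a hedged illustration under stated assumptions: the rule for L is taken to be (L - 1/2)·MA ≤ BL < (L + 1/2)·MA, which is a reconstruction of a formula that is garbled in the source text; widths are integer pixel columns; and the names `division_count`, `to_segments`, and `forced_separation` are hypothetical.

```python
from typing import Dict, List, Tuple

Segment = Tuple[int, int]  # (start, end) columns, end exclusive


def division_count(bl: int, ma: int) -> int:
    """Largest L with (L - 1/2) * MA <= BL, computed in integer arithmetic."""
    return max(1, (2 * bl + ma) // (2 * ma))


def to_segments(start: int, end: int, cuts: List[int]) -> List[Segment]:
    """Turn interior cut positions into consecutive (start, end) segments."""
    edges = [start] + sorted(cuts) + [end]
    return list(zip(edges[:-1], edges[1:]))


def forced_separation(start: int, end: int, ma: int) -> Dict[int, List[Segment]]:
    """Return the segments produced by each method, keyed by DNO (1..3)."""
    width = end - start
    l = division_count(width, ma)
    return {
        # Method 1: step by MA from the start of the cluster.
        1: to_segments(start, end, [start + k * ma for k in range(1, l)]),
        # Method 2: step by MA backwards from the end of the cluster.
        2: to_segments(start, end, [end - k * ma for k in range(1, l)]),
        # Method 3: divide the cluster into L equal parts.
        3: to_segments(start, end, [start + k * width // l for k in range(1, l)]),
    }


# Hypothetical cluster spanning columns 10..54 (BL = 44) with MA = 20.
candidates = forced_separation(10, 54, 20)
print(division_count(44, 20))  # 2
print(candidates[1])           # [(10, 30), (30, 54)]
print(candidates[2])           # [(10, 34), (34, 54)]
print(candidates[3])           # [(10, 32), (32, 54)]
```

Each method yields L segments, so the position of a segment within its list corresponds to PNO and the dictionary key corresponds to DNO.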

In Fig. 4, the character determination unit 17 judges, from the segmentation information sent from the identification unit 15, whether an identification result is that of a forcibly separated pattern. If it is, the identification result is stored temporarily in a buffer. When the final identification result of a run of consecutive forcibly separated patterns has arrived, a selection process is performed that chooses, from the results stored in the buffer so far, the character segmentation method with the highest certainty and outputs its results as the reading result.
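The buffering step of Fig. 4 might look like the following sketch. It assumes that the last result of a forced-separation run can be recognized by DNO = M and PNO = L, which follows from the segmentation information listed earlier but is not stated as the detection rule in the source. The `Result` class, its field names, and the convention that M = 0 marks an ordinary pattern are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, List, Union


@dataclass
class Result:
    char: str          # identified character
    similarity: float  # similarity reported by the identification unit
    m: int = 0         # number of segmentation methods; 0 = not forced
    l: int = 0         # patterns per method
    dno: int = 0       # operation number, 1..M
    pno: int = 0       # pattern number, 1..L


Group = Dict[int, List[Result]]


def group_forced(results: Iterable[Result]) -> Iterator[Union[Result, Group]]:
    """Pass ordinary results through; buffer forced ones until the run ends."""
    buffer: Group = {}
    for r in results:
        if r.m == 0:
            yield r                      # not a forcibly separated pattern
            continue
        buffer.setdefault(r.dno, []).append(r)
        if r.dno == r.m and r.pno == r.l:
            yield buffer                 # final result of the run has arrived
            buffer = {}


# Hypothetical stream: one ordinary character, a forced run (M=2, L=2), another.
stream = [
    Result("A", 0.95),
    Result("X", 0.40, 2, 2, 1, 1), Result("Y", 0.50, 2, 2, 1, 2),
    Result("P", 0.90, 2, 2, 2, 1), Result("Q", 0.90, 2, 2, 2, 2),
    Result("B", 0.90),
]
out = list(group_forced(stream))
print(len(out))        # 3
print(sorted(out[1]))  # [1, 2]
```

The grouped buffer that is yielded at the end of a run is what the selection process would then examine to pick one segmentation method.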

Next, the process of character segmentation and character determination is described, taking the line pattern 20 of Fig. 2 as an example.

The selection process in character determination could use, for example, the similarity of the identification results or their priority (rank); the description here uses similarity. In the line pattern 20, for the pattern 「方定」 in target section 2, the black-dot sequence data 30 forms a cluster larger than the predetermined fixed interval α, so the character segmentation unit 13 performs forced separation. Since the cluster width BL is approximately equal to 2MA, the number of divisions is L = 2, and if the M = 3 example above is adopted as the set of segmentation methods, the character segmentation unit 13 outputs, as shown in Fig. 5, six patterns (two for each method, among them 「方」 and 「定」, and 「右」 and 「宗」) together with the segmentation information for each. The character determination unit 17 detects that this section is a forcibly separated section and selects the most certain of the identification results. Specifically, for the identification results of the patterns numbered under each operation number, it computes the average of their similarities, adopts the method whose average is highest (item 2 in the example of Fig. 5) as the optimal character segmentation method, and sends the corresponding identification results 「方」 and 「定」 to the output terminal 18 as the character reading result.
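The selection by average similarity can be sketched as follows. The similarity values are invented purely for illustration, the two patterns of the first method are shown as "?" because they are not legible in the source text, and assigning 「右」 and 「宗」 to the third method is an assumption. The function name `select_best` is hypothetical.

```python
from statistics import mean
from typing import Dict, List, Tuple

# DNO -> list of (identified character, similarity), ordered by PNO.
Groups = Dict[int, List[Tuple[str, float]]]


def select_best(groups: Groups) -> Tuple[int, str]:
    """Pick the method (DNO) whose results have the highest mean similarity."""
    best_dno = max(
        groups,
        key=lambda dno: mean([sim for _, sim in groups[dno]]),
    )
    text = "".join(char for char, _ in groups[best_dno])
    return best_dno, text


# Hypothetical similarities for the six patterns of the Fig. 5 example.
groups = {
    1: [("?", 0.40), ("?", 0.35)],
    2: [("方", 0.92), ("定", 0.88)],
    3: [("右", 0.55), ("宗", 0.60)],
}
print(select_best(groups))  # (2, '方定')
```

A rank-based variant, which the text mentions as an alternative, would replace the mean similarity in the `key` function with a score derived from the priority of each identification result.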

Thus, according to the above embodiment, the size of a black-dot cluster on the character line is used to distinguish a single character pattern from several character patterns that touch each other, so an interval to be segmented as one character can be told apart from an interval to be segmented as several characters. In addition, because several segmentation counts and several kinds of segmentation method are applied, touching half-width characters as well as touching full-width characters can be segmented, which raises the character reading accuracy. Furthermore, since the character segmentation unit 13 only has to segment patterns mechanically according to the size of the black-dot cluster, the whole character reading process can be arranged as a pipeline, which speeds up processing.

In the character segmentation step of the above embodiment, a plurality of division counts may be set for a black-dot cluster according to its size, and a plurality of mutually different character segmentation methods may be applied. Doing so improves the reading accuracy still further.
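One way to obtain several division counts for the same cluster is to evaluate the count against more than one average character width, for example the full-width average MA and the half-width average MA'. The sketch below assumes the same reconstructed rule, (L - 1/2)·W ≤ BL < (L + 1/2)·W for each width W, and shows only equal division for brevity. The names `candidate_counts` and `equal_division` are hypothetical.

```python
from typing import List, Tuple


def candidate_counts(bl: int, widths: List[int]) -> List[int]:
    """Division counts for a cluster of width BL, one per average width."""
    return sorted({max(1, (2 * bl + w) // (2 * w)) for w in widths})


def equal_division(start: int, end: int, count: int) -> List[Tuple[int, int]]:
    """Split [start, end) into `count` nearly equal (start, end) segments."""
    width = end - start
    edges = [start + k * width // count for k in range(count + 1)]
    return list(zip(edges[:-1], edges[1:]))


# Hypothetical cluster of width 44 with MA = 20 and MA' = 10.
counts = candidate_counts(44, [20, 10])
plans = {c: equal_division(10, 54, c) for c in counts}
print(counts)    # [2, 4]
print(plans[2])  # [(10, 32), (32, 54)]
print(plans[4])  # [(10, 21), (21, 32), (32, 43), (43, 54)]
```

Each count would then be combined with each segmentation method, and the character determination step would choose among all of the resulting candidates.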

(Effects of the Invention)

As described above, according to the present invention, in a scheme that segments characters one at a time from a character-line pattern obtained by scanning and photoelectrically converting a document on a form and performs character recognition on them, the size of each black-dot cluster on the character line is examined. When a cluster is larger than a predetermined fixed interval, the number of divisions is determined according to the size of the cluster, a plurality of mutually different character segmentation methods are applied, and each resulting piece is forcibly segmented as a single-character pattern. The segmented patterns are output together with information about their segmentation, that information is used to determine that they are forcibly separated patterns, the segmentation method with the highest certainty is found from the identification results of the forcibly separated patterns, and its identification results are output as the character reading result. Consequently, a document containing touching characters can be read unambiguously without complicated processing, processing is faster, and accuracy is high. Moreover, when a plurality of division counts are set according to the size of the black-dot cluster and a plurality of mutually different character segmentation methods are applied, a large variety of forcibly separated patterns can be obtained, with the advantage that the reading accuracy can be improved still further.

[Brief Description of the Drawings]

Fig. 1 is a block diagram showing an embodiment of a character reading device to which the method of the present invention is applied; Fig. 2 is an explanatory diagram showing an example of a line pattern and its black-dot sequence data; Fig. 3 is a flowchart of the character segmentation unit 13; Fig. 4 is a flowchart of the character determination unit 17; and Fig. 5 is an explanatory diagram showing how character segmentation, identification, and character determination are carried out for a line pattern.
11: input terminal, 12: pattern memory, 13: character segmentation unit, 14: feature extraction unit, 15: identification unit, 16: identification dictionary unit, 17: character determination unit, 18: output terminal.
Patent applicant: Nippon Telegraph and Telephone Public Corporation. Agent: patent attorney 吉 1) 精 、孝

Claims (2)

[Claims]
(1) A character reading method in which characters are segmented one at a time from a black-and-white binary character-line pattern obtained by scanning and photoelectrically converting the characters on a form and character recognition is performed on them, characterized by comprising: a character segmentation step of counting the black-dot clusters present within a predetermined fixed interval on the character line and, when a black-dot cluster on the character line is larger than the predetermined fixed interval, determining a number of divisions according to the size of the cluster, applying a plurality of mutually different character segmentation methods, and outputting information about the character segmentation together with the forcibly separated segmented patterns; and a character determination step of using the identification results of the segmented patterns and the information about their segmentation to regard, among the plurality of character segmentation methods, the one showing the highest certainty value as the optimal character segmentation method, and outputting the identification results obtained with that method as the character reading result.
(2) A character reading method in which characters are segmented one at a time from a black-and-white binary character-line pattern obtained by scanning and photoelectrically converting the characters on a form and character recognition is performed on them, characterized by comprising: a character segmentation step of counting the black-dot clusters present within a predetermined fixed interval on the character line and, when a black-dot cluster on the character line is larger than the predetermined fixed interval, setting a plurality of division counts for the cluster according to the size of the cluster, applying a plurality of mutually different character segmentation methods, and outputting information about the character segmentation together with the forcibly separated segmented patterns; and a character determination step of using the identification results of the segmented patterns and the information about their segmentation to regard, among the plurality of character segmentation methods, the one showing the highest certainty value as the optimal character segmentation method, and outputting the identification results obtained with that method as the character reading result.
JP59009831A 1984-01-23 1984-01-23 Character reading method Expired - Lifetime JPH0614372B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP59009831A JPH0614372B2 (en) 1984-01-23 1984-01-23 Character reading method

Publications (2)

Publication Number Publication Date
JPS60153574A (en) 1985-08-13
JPH0614372B2 (en) 1994-02-23

Family

ID=11731073

Family Applications (1)

Application Number Title Priority Date Filing Date
JP59009831A Expired - Lifetime JPH0614372B2 (en) 1984-01-23 1984-01-23 Character reading method

Country Status (1)

Country Link
JP (1) JPH0614372B2 (en)
