JPH11224310A

JPH11224310A - Character recognizing device

Info

Publication number: JPH11224310A
Application number: JP10038101A
Authority: JP
Inventors: Hiroshi Sasaki; 佐々木　　寛; Yoshinori Ookuma; 好憲大熊
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 1998-02-04
Filing date: 1998-02-04
Publication date: 1999-08-17

Abstract

PROBLEM TO BE SOLVED: To reduce the number of arithmetic operations by limiting the kinds of secondary segments obtained by combining primary segments. SOLUTION: A primary segment extraction unit 105 finds cutout candidate positions for characters in the image of the character string to be recognized and divides the image at those positions to obtain a plurality of primary segments. A certainty flag setting unit 106 finds positions where the gap between primary segments is wide, judges them to be character boundaries, and sets certainty flags. Taking these flags into account, a secondary segment creation unit determines which primary segments to combine, creates the secondary segments, and makes them targets of character recognition.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a character recognition device used, for example, to recognize characters from an image of written characters and convert them into character codes.

[0002]

[Prior Art] Character recognition devices that scan characters written on a form and convert them into character codes are widely used for entering data into computers. Such a device cuts out the image of each character, extracts its features, and recognizes the character by comparing those features against a pre-registered feature dictionary. Conventional character recognition devices designed for printed type exploited the fact that printed characters are evenly spaced and cut characters out of the input image at fixed intervals. With handwritten characters or proportional fonts, however, character spacing and character shape vary widely, and adjacent characters may touch. Cutting at "fixed intervals" as described above therefore produces many "segmentation errors", in which characters are cut at the wrong positions.

[0003] As a countermeasure, a technique is described in Japanese Unexamined Patent Publication No. H3-37782. Instead of cutting characters at fixed intervals, this technique analyzes a histogram of the distribution of black pixels making up the characters, finds places that appear to be character edges, and treats them as cutout candidate positions. At this stage, the portion between two cutout candidate positions is a partial pattern of a character, and character candidates are created by combining these partial patterns.

[0004]

[Problem to Be Solved by the Invention] The prior art described above, however, has the following problem. It combines the partial character patterns obtained by dividing at each cutout candidate position into new candidate character patterns, so as to consider every possible way of cutting. When all possible cutout candidate positions are considered, however, the amount of computation becomes enormous in proportion to the number of characters to be recognized. This in turn lengthens the processing time for character recognition.

[0005]

[Means for Solving the Problem] To solve the above problem, the present invention adopts the following configurations. <Configuration 1> A character recognition device comprising: a primary segment extraction unit that, for a character string in an input image, obtains a histogram representing the distribution of black pixels making up each character, finds a plurality of cutout candidate positions oriented substantially perpendicular to the direction in which the characters are arranged, and divides the character string at these cutout candidate positions to obtain a group of primary segments; a certainty flag setting unit that sets a certainty flag to distinguish primary segments located at a boundary between adjacent characters from the other segments; a secondary segment creation unit that, with reference to the certainty flags, creates secondary segments by merging mutually adjacent primary segments extracted by the primary segment extraction unit while excluding combinations of primary segments that straddle a character boundary; a character recognition unit that recognizes the candidate character image enclosed by each primary or secondary segment; and a candidate tree selection unit that evaluates the character recognition results using a candidate tree.

[0006] <Configuration 2> The device according to Configuration 1, wherein the certainty flag setting unit calculates, for every primary segment, the width of the space between that primary segment and the adjacent primary segment and, when the space width is equal to or greater than a predetermined threshold, sets a certainty flag on the segment, thereby attaching to one of the two primary segments the information that a character boundary lies between them.

[0007] <Configuration 3> The device according to Configuration 1, wherein the certainty flag setting unit obtains, for every primary segment, a feature value representing the shape of the segment and, when the feature value falls within a predetermined range, attaches a certainty flag to the segment.

[0008] <Configuration 4> The device according to Configuration 1, wherein the certainty flag setting unit performs character recognition once on every primary segment, matching it against a dictionary for recognizing figures that form specific parts of characters, and, when the segment is recognized as one of those figures, attaches a certainty flag to the primary segment.

[0009]

[Embodiments of the Invention] Embodiments of the present invention are described below using specific examples. <<Specific Example 1>> <Overall configuration> FIG. 1 is a block diagram of the entire character recognition device of Specific Example 1. Before this figure is described, the concepts of obtaining a character histogram and detecting cutout candidate positions are explained. FIG. 2 is an explanatory diagram of a conventionally known histogram and character recognition method. FIG. 2(a) shows, for the handwritten character string "八王子" (Hachioji), a histogram representing the distribution of black pixels making up each character, together with the cutout candidate positions found substantially perpendicular to the direction in which the characters are arranged. The broken lines 21, 22, and 23 in the figure are the cutout candidate positions. In this way, basic patterns 24 to 27 are extracted on the basis of the continuity of the histogram.

[0010] Conventionally, a single basic pattern, or a pattern formed by joining several consecutive basic patterns as shown in FIG. 2(b), was treated as a character candidate and its evaluation value was computed. With this approach, as the candidate tree in FIG. 2(c) shows, arithmetic processing must be repeated for a very large number of combinations before the optimal cutout positions are determined, which lengthens the processing time.
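To make the growth in combinations concrete, the following is a minimal counting sketch, not part of the patent. It assumes that every gap between consecutive basic patterns may independently be either a cut or a join, and it ignores any limit on how wide a joined pattern may become. The function name `count_segmentations` and the gap numbering are hypothetical.

```python
def count_segmentations(num_patterns: int, fixed_boundaries: frozenset = frozenset()) -> int:
    """Count the ways to group consecutive basic patterns into characters.

    Gaps are numbered 0 .. num_patterns - 2; gap i lies between pattern i
    and pattern i + 1. A gap listed in `fixed_boundaries` is known to be a
    character boundary and is always cut. Every other gap may be either
    cut or joined across (an assumption of this sketch).
    """
    if num_patterns <= 0:
        return 0
    free_gaps = [g for g in range(num_patterns - 1) if g not in fixed_boundaries]
    # Each free gap doubles the number of possible groupings.
    return 2 ** len(free_gaps)
```

Under these assumptions, each gap that is known in advance to be a character boundary halves the number of groupings that have to be evaluated.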

[0011] In the present invention, cutout candidate positions are found as in FIG. 2(a) to obtain the portions corresponding to the basic patterns (called primary segments in the present invention), and a new concept, the certainty of a cutout candidate position, is introduced. This reduces the number of secondary segments obtained by combining primary segments, and thus the number of arithmetic operations.

[0012] Returning to FIG. 1, the configuration of the device is described. The device comprises a control unit 100 that controls the overall processing; an image storage unit 101 that stores the input image; a primary/secondary segment information storage unit 102 that temporarily stores the primary and secondary segments, which are intermediate information about the characters; a character information storage unit 103 that temporarily stores character images and the character codes obtained by recognizing them; and a candidate tree storage unit 104 that stores combinations of primary and secondary segments as a candidate tree.

[0013] The device also comprises a primary segment extraction unit 105, which reads the input image from the image storage unit 101, extracts primary segments, and stores the primary segment information in the primary/secondary segment information storage unit 102; and a certainty flag setting unit 106, which reads the stored primary segment information from the primary/secondary segment information storage unit 102 and sets certainty flags.

[0014] The device further comprises a secondary segment creation unit 107, which reads primary segment information from the primary/secondary segment information storage unit 102 and, when the space width between a primary segment and the primary segment adjacent to it on the right satisfies a condition prepared in advance, merges the two into a secondary segment and records that secondary segment in the primary/secondary segment information storage unit 102.

[0015] Also provided are a character pattern cutout unit 108, which reads the primary and secondary segment information from the primary/secondary segment information storage unit 102, reads the image at the corresponding coordinates from the image storage unit 101, and stores that image in the character information storage unit 103 as the character pattern enclosed by the segment; and a character recognition unit 109, which recognizes the cut-out character patterns held in the character information storage unit 103 and stores the top K character codes of the recognition result in the character information storage unit 103.

[0016] In addition, there are provided a candidate tree creation unit 110, which reads segment information from the primary/secondary segment information storage unit 102 and character information from the character information storage unit 103, creates the combinations of character cutouts (hereinafter, the candidate tree), and stores them in the candidate tree storage unit 104; and a result output unit 111, which outputs from the candidate tree storage unit 104 the combination of segments that gives the best cutout and recognition result.

[0017] <Certainty flag setting unit> Next, the configuration of the certainty flag setting unit 106 is described. FIG. 3 is a block diagram of the certainty flag setting unit of Specific Example 1. The certainty flag setting unit 106 comprises a segment A acquisition unit 800 and a segment B acquisition unit 801 for acquiring primary segment information from the primary/secondary segment information storage unit 102; a space width calculation unit 802 for calculating the space width between the acquired primary segment A and primary segment B; and a space width storage unit 803 for temporarily storing the calculated space width.

[0018] It also comprises a space width threshold storage unit 805 that stores the threshold used as the criterion for whether to set a certainty flag; a flag setting determination unit 804 that refers to the space width stored in the space width storage unit 803 and the threshold stored in the space width threshold storage unit 805 to decide whether to set a certainty flag; and a segment setting unit 806 that sets the certainty flag on the segment when one is to be set. Each of these functional blocks is connected to the control unit 100.

[0019] <Secondary segment creation unit> FIG. 4 is a block diagram of the secondary segment creation unit of Specific Example 1. A secondary segment is a segment formed by merging several adjacent primary segments into one. The secondary segment creation unit in the figure comprises a primary segment A information acquisition unit 1000 for acquiring information on primary segment A from the primary/secondary segment information storage unit 102; likewise, a primary segment B information acquisition unit 1001; and a primary segment A information storage unit 1002 and a primary segment B information storage unit 1003 for temporarily storing the information of the two segments.

[0020] It also comprises a merged segment information storage unit 1004 for storing the merged segment that shares its left end point with primary segment A; a certainty flag checking unit 1005 for checking whether a certainty flag is set at the right end point of the merged segment; a merge threshold condition storage unit 1006 for storing in advance the distance threshold used to judge whether segments can be merged; and a merge feasibility determination unit 1007 that judges whether merging is possible on the basis of the threshold stored in the merge threshold condition storage unit 1006 and the distance D between the merged segment and primary segment B.

[0021] It further comprises a segment merging unit 1008 that, on the basis of the result from the merge feasibility determination unit 1007, merges the merged segment with primary segment B to create a new merged segment and stores it as a secondary segment in the primary/secondary segment information storage unit 102.

[0022] <Candidate tree creation unit> FIG. 5 is a block diagram of the candidate tree creation unit of Specific Example 1. The candidate tree creation unit 110 in the figure comprises a candidate node queue storage unit 1204 that stores, as a queue, the nodes needed to build the candidate tree; a primary/secondary segment information acquisition unit 1200 that acquires information on primary or secondary segments from the primary/secondary segment information storage unit 102 for addition to the candidate tree; and a candidate tree addition unit 1201 that adds the segment information acquired from the primary/secondary segment information storage unit 102 to the candidate tree.

[0023] It also comprises an evaluation value calculation unit 1205 that, when a segment is added to the candidate tree, adds the evaluation value of the primary or secondary segment to the evaluation value of the parent node; a queue addition unit 1202 that adds the segment information corresponding to a node of the candidate tree to the candidate node queue storage unit 1204; and, conversely, a queue acquisition unit 1203 that retrieves information from the queue.

[0024] <Overall processing> FIG. 6 is a flowchart of the operation of the entire device. Processing starts at the start position in the figure, performs a series of steps, and ends at the end position.
(1) Step S1. First, the image to be processed is input. The image input here is one in which the characters are arranged in a single line, generally called a "line image". This line image is read and stored in the image storage unit 101 shown in FIG. 1.

[0025] (2) Step S2. Next, primary segment extraction is performed. This can easily be realized by the well-known histogram-based method. FIG. 7(a) illustrates an example of primary segment extraction. Specifically, as shown in FIG. 7(a), a histogram KV of black pixels is computed in the direction perpendicular to the line of characters (hereinafter, the vertical histogram). The vertical histogram is then scanned along its horizontal axis, and the ranges in which values at or above a certain threshold continue are extracted.
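The vertical histogram and the scan for consecutive above-threshold ranges can be sketched as follows. This is an illustrative sketch only: it assumes a horizontally written line image given as a list of rows of 0/1 values (1 = black pixel), and the names `column_histogram` and `runs_at_or_above` are hypothetical, not taken from the patent.

```python
def column_histogram(image):
    """Count black pixels in each column of a binary line image.

    `image` is a list of rows; each row is a list of 0/1 values.
    """
    if not image:
        return []
    width = len(image[0])
    return [sum(row[x] for row in image) for x in range(width)]


def runs_at_or_above(hist, threshold):
    """Return (start, end) column ranges, end exclusive, in which
    consecutive histogram values are all >= threshold."""
    runs = []
    start = None
    for x, value in enumerate(hist):
        if value >= threshold:
            if start is None:
                start = x  # a new run begins here
        elif start is not None:
            runs.append((start, x))  # the run ended at the previous column
            start = None
    if start is not None:
        runs.append((start, len(hist)))  # run reaches the right edge
    return runs
```

Each returned range corresponds to the horizontal extent of one primary segment candidate; the threshold value itself is left open here, as it is in the text.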

[0026] This yields the horizontal cutout candidate positions of the primary segments. Then, for each extracted pair of horizontal cutout points, a horizontal histogram KH is computed and the vertical cutout candidate positions are found. For the leftmost primary segment in the figure, for example, the downward and sideways arrows shown in the figure are the cutout candidate positions. Next, the width W and height H of the primary segment and their ratio ρ (hereinafter, the aspect ratio) are computed, and the extracted primary segment is stored together with this information in the primary/secondary segment information storage unit 102 shown in FIG. 1.
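A minimal sketch of computing the bounding box, W, H, and ρ for one primary segment follows. It makes two assumptions that the text does not fix: a row counts as occupied if it contains at least one black pixel within the segment's columns, and ρ is taken as H / W. The function name `segment_box` and the dictionary keys are hypothetical.

```python
def segment_box(image, x0, x1):
    """Bounding box and aspect ratio of the black pixels in columns [x0, x1).

    `image` is a list of rows of 0/1 values (1 = black pixel).
    Returns None when the column range contains no black pixel.
    """
    # Rows with at least one black pixel inside the column range
    # (assumption: a row-histogram threshold of 1).
    rows = [y for y, row in enumerate(image) if any(row[x0:x1])]
    if not rows:
        return None
    top, bottom = rows[0], rows[-1] + 1  # bottom is exclusive
    width = x1 - x0
    height = bottom - top
    return {
        "left": x0, "right": x1, "top": top, "bottom": bottom,
        "W": width, "H": height,
        "rho": height / width,  # assumption: aspect ratio defined as H / W
    }
```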

[0027] (3) Step S3. Next, certainty flag setting is performed. FIG. 7(b) illustrates an example of setting certainty flags. The specific configuration and processing method are described later; here the certainty flag itself is explained. The certainty flag expresses the likelihood that a position is the right end of a character. When a certainty flag is set on a primary segment, it means "the right end point of this primary segment is the right end point of a character".

[0028] In other words, the certainty flag is information for distinguishing primary or secondary segments located at the boundary between adjacent characters from the other segments. Setting this flag on primary segments (or, when primary segments have been combined into a secondary segment, on that secondary segment) limits the number of secondary segments created in the next step. Limiting the number of secondary segments in turn limits the size of the candidate tree described later, so the overall amount of computation is reduced.

[0029] Setting a segment's certainty flag attaches the information that a character boundary lies between two adjacent primary segments. It suffices to attach this information to either one of the two adjacent primary segments, and attaching it to both is also acceptable. Alternatively, the boundary may be indicated indirectly by attaching, to the other primary segments, information that they are not segments touching such a boundary.

[0030] (4) Step S4. Next, secondary segment creation is performed. FIG. 8 illustrates an example of creating secondary segments. The specific configuration and processing method are described later; the processing is outlined briefly here. The extracted primary segments are scanned from the left end, and each one (hereinafter, primary segment A) is merged with an adjacent primary segment (hereinafter, primary segment B) to create a new segment, a secondary segment. The portions enclosed by rectangular frames in FIG. 8 are secondary segments. If the certainty flag is set on primary segment A, no secondary segment is created. This limits the number of secondary segments created. The created secondary segments are stored in the primary/secondary segment information storage unit 102 shown in FIG. 1.
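The outline of step S4 can be sketched roughly as below. This is an illustration under stated assumptions, not the patent's exact procedure: primary segments are hypothetical dicts with `left`, `right`, and a boolean `certain` (the certainty flag), sorted left to right, and merging is assumed to be allowed only while the distance from the left end of the growing segment to the left end of the next primary segment does not exceed `max_distance`. The direction of that comparison is an assumption.

```python
def make_secondary_segments(segments, max_distance):
    """Create secondary segments by merging runs of adjacent primary segments.

    A merge never continues past a primary segment whose certainty flag is
    set, because its right end is treated as a character boundary.
    """
    secondary = []
    for i, first in enumerate(segments):
        for j in range(i + 1, len(segments)):
            # The current right end of the merged segment is segment j - 1.
            if segments[j - 1]["certain"]:
                break  # confirmed character boundary: stop extending
            distance = segments[j]["left"] - first["left"]
            if distance > max_distance:  # assumed form of the merge condition
                break
            secondary.append({
                "left": first["left"],
                "right": segments[j]["right"],
                "parts": (i, j),  # indices of first and last primary segment
            })
    return secondary
```

With no flags set, every run of neighbours within the distance limit becomes a secondary segment; each flag removes all merges that would straddle the flagged boundary.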

[0031] (5) Step S5. Next, character pattern cutout is performed. From the coordinate information of the primary and secondary segments created above, the image of the region enclosed by each segment frame is extracted as the character pattern of that segment and stored in the character information storage unit 103 shown in FIG. 1.

[0032] (6) Step S6. Next, character recognition is performed. This processing reads the character pattern information from the character information storage unit 103 and recognizes each pattern as a character. The top K character codes of the recognition result, with their confidence values Ri (K = 1 in this specific example), are stored back in the character information storage unit 103 as the character information of the corresponding segment.

[0033] (7) Step S7. Next, candidate tree creation is performed. FIG. 9 is an explanatory diagram of the candidate tree of a comparative example, and FIG. 10 is an explanatory diagram of the candidate tree of Specific Example 1. The specific creation processing is described later; here the candidate tree itself is explained. A candidate tree is a set of sequences made up of combinations of primary and secondary segments, and it represents the multiple ways in which the input character string image can be cut. FIG. 9 shows the candidate tree obtained with the conventional method, without certainty flags, and FIG. 10 shows the candidate tree of Specific Example 1, with certainty flags. Following the arrows from the black dot (●) at the left end of the candidate tree to the rightmost segment traces one of the several possible ways of cutting the input image.

[0034] The candidate trees in FIGS. 9 and 10 have a root at the left end, with branches extending from it to the right. In the example of FIG. 9, seven ways of cutting are listed as candidates; in the example of FIG. 10, four. The character and the number drawn below the secondary segment at each node are the recognition result and the evaluation value of that segment, respectively. A larger evaluation value means a better evaluation. In this specific example, the evaluation value is based on the aspect ratio and the recognition result.

[0035] (8) Step S8. Finally, result output is performed. From the candidate tree created above, the branch with the largest evaluation value is output as the final result. In the specific example of FIG. 10, the bottom branch is the final result, and its recognition result is "品川区" (Shinagawa-ku).
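The candidate tree and the final selection can be sketched with a simple queue-based expansion. This is an illustrative sketch, assuming each candidate segment is described by the span of primary-segment indices it covers and a single score, and that a branch's evaluation value is the sum of its nodes' scores (as the evaluation value calculation unit adds a segment's value to its parent's). The name `best_segmentation` and the data layout are hypothetical.

```python
from collections import deque


def best_segmentation(num_primary, candidates):
    """Pick the root-to-leaf path with the largest accumulated score.

    `candidates` maps (start, end) spans of primary-segment indices
    (end inclusive, end >= start) to a (label, score) pair.
    Returns (list_of_labels, total_score), or None if no path covers
    all primary segments.
    """
    # Each queue entry is a tree node: next uncovered index, path so far,
    # and the evaluation value accumulated from the root.
    queue = deque([(0, [], 0)])
    best = None
    while queue:
        pos, path, total = queue.popleft()
        if pos == num_primary:  # reached the rightmost segment
            if best is None or total > best[1]:
                best = (path, total)
            continue
        for (start, end), (label, score) in candidates.items():
            if start == pos:
                # Child node: parent's value plus this segment's value.
                queue.append((end + 1, path + [label], total + score))
    return best
```

Because every branch is expanded, the work is proportional to the number of branches, which is why reducing the number of secondary segments beforehand matters.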

[0036] <Certainty flag setting unit> Here, the certainty flag processing mentioned in step S3 of the overall processing shown in FIG. 6 is described. FIG. 11 is a flowchart of the certainty flag setting process of Specific Example 1. It is explained with reference to FIG. 7.
(1) Step S11. First, the leftmost primary segment is taken as primary segment A, and the information on primary segment A is acquired.
(2) Step S12. It is determined whether a primary segment exists to the right of primary segment A.
(3) Step S13. If one exists, the primary segment to the right of primary segment A is taken as primary segment B, and its information is acquired. Otherwise, processing proceeds to "OUT" and ends.

[0037] (4) Step S14. From the information on primary segment A and primary segment B, the width of the space between them is calculated and denoted sp. FIG. 7(b) shows sp0 to sp3.
(5) Step S15. The calculated and stored sp is compared with the threshold TH prepared in advance.
(6) Step S16. If the comparison shows that sp is equal to or greater than the threshold, the certainty flag is set on primary segment A. With this flag, the right end point of primary segment A is always treated as the right end point of a character. If sp is less than the threshold, nothing is done.
(7) Step S17. Primary segment B is taken as the new primary segment A, and processing returns to step S12. The same processing is thus carried out for the segments further to the right.
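Steps S11 to S17 amount to a single left-to-right pass over neighbouring pairs. A minimal sketch, assuming primary segments are hypothetical dicts with `left` and `right` x-coordinates (right exclusive) sorted left to right, and a flag stored under the hypothetical key `certain`:

```python
def set_certainty_flags(segments, threshold):
    """Set seg["certain"] = True on every primary segment whose gap to its
    right-hand neighbour is at least `threshold`.

    The rightmost segment has no right-hand neighbour and is left unflagged.
    """
    for seg in segments:
        seg["certain"] = False
    # Walk over (A, B) pairs of neighbours from left to right.
    for seg_a, seg_b in zip(segments, segments[1:]):
        sp = seg_b["left"] - seg_a["right"]  # space width between A and B
        if sp >= threshold:
            seg_a["certain"] = True
    return segments
```

For the threshold, the text reports using 0.4 times the line height, so a caller might pass `threshold = 0.4 * line_height`, where `line_height` is whatever height the caller measured for the line image.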

[0038] In FIG. 7(b), described earlier, downward-pointing triangles mark the positions where the accuracy flag was set. In this example the flag is set at two positions. The threshold TH used in step S15 was set to 0.4 times the line height. This value was determined empirically, and the threshold is not limited to it.
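The spacing test of steps S11 to S17 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `PrimarySegment` class, its field names, and the function name are assumptions, and the segments are assumed to be sorted from left to right.

```python
from dataclasses import dataclass

@dataclass
class PrimarySegment:
    left: int            # x coordinate of the left edge (assumed representation)
    right: int           # x coordinate of the right edge
    flag: bool = False   # accuracy flag: the right edge is a definite character boundary

def set_accuracy_flags(segments, line_height, ratio=0.4):
    """Set the flag on each segment that is followed by a space of at least TH."""
    th = ratio * line_height                    # TH = 0.4 x line height in the description
    for a, b in zip(segments, segments[1:]):    # steps S12, S13, S17: walk neighbouring pairs
        sp = b.left - a.right                   # step S14: space width between A and B
        if sp >= th:                            # steps S15 and S16
            a.flag = True
    return segments
```

With a line height of 30 the threshold is 12, so only spaces of 12 or more produce a flag; narrower spaces leave the boundary undecided for the later stages.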

[0039] <Secondary Segment Creation Unit> FIG. 12 shows a flowchart of the secondary segment creation unit of Specific Example 1.
(1) Step S21. First, the leftmost primary segment is taken as primary segment A, and its information is acquired.
(2) Step S22. Primary segment A is taken as the integrated segment and stored in the integrated segment information storage unit 1004 of FIG. 4.
(3) Step S23. It is determined whether another primary segment B exists immediately to the right of the integrated segment. If NO, the process proceeds to OUT and ends.
(4) Step S24. If YES, the information on primary segment B is acquired.

[0040] (5) Step S25. It is determined whether the accuracy flag is set on the integrated segment.
(6) Step S30. If YES, primary segment B becomes primary segment A, and the process returns to step S22.
(7) Step S26. If NO, the distance D between the integrated segment and primary segment B is calculated. D is the difference between the X coordinate of the left edge of primary segment B and the X coordinate of the left edge of the integrated segment. For vertically written text, D is the difference between the Y coordinate of the top edge of primary segment B and the Y coordinate of the top edge of the integrated segment.

[0041] (8) Step S27. It is determined whether the calculated distance D is less than or equal to a threshold prepared in advance.
(9) Step S28. If YES, the integrated segment and primary segment B are merged to form a new integrated segment. Any accuracy flag set on primary segment B is carried over to the new integrated segment.
(10) Step S29. The integrated segment is then stored as a secondary segment in the primary/secondary segment information storage unit 102, and the process returns to step S23.
(11) If the determination in step S27 is NO, primary segment B becomes primary segment A in step S30, and the process returns to step S22.

[0042] FIG. 8, described earlier, shows two kinds of secondary segments. On the left are the secondary segments created using the accuracy flag in this specific example; on the right are those created without it. The threshold TH for the distance D calculated in step S26 was set to 1.2 times the line height. This value was set empirically, and the threshold is not limited to it. As described above, by attending to the accuracy flag, the adjacent primary segments extracted by the primary segment extraction unit are merged while combinations of primary segments that straddle a character boundary are excluded, so that a reduced number of secondary segments is created.
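The flow of steps S21 to S30 can be sketched as follows. This is only an illustration under stated assumptions: the `Segment` class and the function name are hypothetical, secondary segments are returned as (first index, last index) pairs for convenience, and the distance D is measured from left edge to left edge exactly as the description words it.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    left: int            # x coordinate of the left edge (assumed representation)
    right: int           # x coordinate of the right edge
    flag: bool = False   # accuracy flag on the segment's right edge

def make_secondary_segments(primaries, line_height, ratio=1.2):
    """Return secondary segments as (first_index, last_index) pairs."""
    th = ratio * line_height                   # threshold for D: 1.2 x line height
    secondaries = []
    start = 0                                  # step S21: A is the leftmost segment
    while start < len(primaries):
        left = primaries[start].left           # step S22: the integrated segment starts as A
        flag = primaries[start].flag
        nxt = start + 1
        while nxt < len(primaries):            # step S23: is there a B on the right?
            b = primaries[nxt]                 # step S24
            if flag:                           # step S25: never merge across a flag
                break
            d = b.left - left                  # step S26: left edge to left edge
            if d > th:                         # step S27
                break
            flag = b.flag                      # step S28: merge and inherit B's flag
            secondaries.append((start, nxt))   # step S29
            nxt += 1
        start = nxt                            # step S30: B becomes the new A
    return secondaries
```

In this sketch a flag on the second of four closely spaced segments stops the run after the pair (0, 1), whereas without the flag the run continues to (0, 2); that is the pruning effect the description attributes to the accuracy flag.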

[0043] <Candidate Tree Creation Unit> Next, the candidate tree creation process outlined in step S7 of the overall process of FIG. 6 is described. FIG. 13 shows a flowchart of the candidate tree creation unit of Specific Example 1.
(1) Step S31. First, the candidate node queue Q is emptied. This initializes the queue.
(2) Step S32. Next, information on every primary or secondary segment located at the leftmost position is added to the candidate node queue Q. At the same time, each of these leftmost segments is given an evaluation value and added to the candidate tree as a child of the root node. The calculation of the evaluation value is described in detail later with reference to FIG. 14.

[0044] (3) Step S33. It is determined whether the candidate node queue Q is empty. If YES, the process proceeds to OUT and ends.
(4) Step S34. If NO, information on the primary or secondary segment at the head of the candidate node queue Q is acquired, and that segment becomes parent node P.

[0045] (5) Step S35. It is determined whether any primary or secondary segments are adjacent to parent node P on its right. If NO, the process returns to step S33.
(6) Step S36. If YES, each of these right-adjacent segments is given an evaluation value and added to the candidate tree as a child of parent node P.
(7) Step S37. All of these right-adjacent segments are added to the candidate node queue Q. The process then returns to step S33.
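The queue-driven construction of steps S31 to S37 amounts to a breadth-first expansion, which can be sketched as follows. The representation is an assumption made for illustration: each segment is a (first, last) pair of primary-segment indices, a segment is right-adjacent to P when it starts at the index after P's last one, the queue holds tree nodes rather than bare segment records, and `score` is a hypothetical callable that returns a segment's own evaluation value.

```python
from collections import deque
from dataclasses import dataclass, field

@dataclass
class Node:
    segment: tuple          # (first, last) primary-segment indices; None for the root
    value: float = 0.0      # evaluation value accumulated from the root
    children: list = field(default_factory=list)

def build_candidate_tree(segments, score):
    root = Node(segment=None)
    queue = deque()                                  # step S31: start with an empty queue
    for seg in segments:                             # step S32: leftmost segments
        if seg[0] == 0:
            child = Node(seg, score(seg))
            root.children.append(child)
            queue.append(child)
    while queue:                                     # step S33: stop when the queue is empty
        parent = queue.popleft()                     # step S34: head of the queue becomes P
        for seg in segments:                         # step S35: right-adjacent segments
            if seg[0] == parent.segment[1] + 1:
                child = Node(seg, parent.value + score(seg))   # step S36
                parent.children.append(child)
                queue.append(child)                  # step S37
    return root

def branches(node, path=()):
    """Yield (path, value) for every root-to-leaf branch."""
    if node.segment is not None and not node.children:
        yield path, node.value
    for child in node.children:
        yield from branches(child, path + (child.segment,))
```

For three primary segments plus one secondary segment covering the first two, `branches` yields two paths: one through the three single segments and one through the secondary segment followed by the third.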

[0046] In FIG. 10, described earlier, construction starts from the root at the left end and the tree is extended step by step, ending with a candidate tree of four branches. The evaluation value of each leaf is shown as a number in parentheses below the secondary segment of the node. Limiting the number of secondary segments at this stage keeps the candidate tree built in the later stage small.

[0047] <Calculation of Evaluation Value> Next, the method of calculating the evaluation value in Specific Example 1 is described. FIG. 14 illustrates an example of the calculation. The evaluation values are defined as follows.
Evaluation value of one segment = product of the normalization coefficient n, the aspect ratio ρ, and the certainty factor R of the recognition result
Evaluation value of a branch from the root of the candidate tree to one leaf = sum of the evaluation values of all nodes making up the branch

[0048] The evaluation value of each branch of the candidate tree under construction can then be calculated easily by the following procedure.
(1) Step S41. First, the evaluation value Vp of the parent node is acquired.
(2) Step S42. The evaluation value n × ρ × R of the child node being added is acquired.
(3) Step S43. The sum of the parent node's evaluation value Vp and the child node's own evaluation value is calculated (Vp = Vp + n × ρ × R).
Following the arrows in the figure to the right, the evaluation values of the nodes thus accumulate in order.
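The accumulation of steps S41 to S43 is a running sum, as the short sketch below shows. The function names and the (n, ρ, R) triples are invented for illustration, and starting the root at 0 is an assumption, since the description does not state a value for the root.

```python
def segment_value(n, rho, r):
    """Evaluation value of one segment: n x rho x R."""
    return n * rho * r

def accumulate(parent_value, n, rho, r):
    """Steps S41 to S43: Vp + n x rho x R for a newly added child node."""
    return parent_value + segment_value(n, rho, r)

# Walk one branch of three nodes; the triples are made-up numbers.
branch = [(1.0, 0.5, 0.5), (1.0, 1.0, 0.75), (2.0, 0.5, 0.5)]
vp = 0.0                         # assumed: the root carries no evaluation value
for n, rho, r in branch:
    vp = accumulate(vp, n, rho, r)
```

After the loop, `vp` holds the evaluation value of the whole branch, which is the quantity compared across branches when the final result is chosen.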

[0049] <Effect of Specific Example 1> In the prior art, every combination is considered when primary segments are combined into secondary segments, so many secondary segments are produced. As a result, the candidate tree built in the later stage becomes large, the amount of computation becomes enormous, and processing slows down.

[0050] In contrast, introducing the accuracy flag of this specific example limits the number of secondary segments, so the candidate tree becomes much smaller, with a large improvement in both the amount of computation and the processing speed. In addition, because the correct branch is reliably retained on the basis of the accuracy flag, segmentation errors are greatly reduced and recognition accuracy improves accordingly. The effect is especially large for character strings written with spaces between the characters.

[0051] For example, comparing the comparative example of FIG. 9 with the specific example of FIG. 10 shows that the candidate tree built with the accuracy flag has about half as many branches as the one built without it.

[0052] <<Specific Example 2>> <Overall Configuration> The overall configuration of the device of Specific Example 2 is the same as that of Specific Example 1 shown in FIG. 1, so its description is omitted here. In Specific Example 2, however, the configuration of the accuracy flag setting unit differs.

[0053] <Configuration of the Accuracy Flag Setting Unit> FIG. 15 shows a block diagram of the accuracy flag setting unit of Specific Example 2. The accuracy flag setting unit 106* in the figure includes: a primary segment A acquisition unit 2000 for acquiring a primary segment from the primary/secondary segment information storage unit 102; a segment width/height/aspect ratio calculation unit 2002 for calculating the width, height, and aspect ratio of the acquired primary segment A; a primary segment A width/height/aspect ratio storage unit 2003 for temporarily storing the calculated width, height, and aspect ratio of primary segment A; and a segment height/width/aspect ratio threshold storage unit 2004 for storing the thresholds against which the calculated width, height, and aspect ratio of primary segment A are compared.

[0054] The unit also includes a flag set determination unit 2005, which compares the calculated width, height, and aspect ratio of primary segment A with the corresponding thresholds and determines whether to set the accuracy flag on primary segment A, and a primary segment A setting unit 2006, which sets the accuracy flag according to the result of that determination. The accuracy flag setting unit further includes a primary segment B acquisition unit 2001, which acquires as primary segment B the primary segment adjacent on the left to the primary segment A flagged by the flag set determination unit 2005, and a primary segment B setting unit 2007, which sets the accuracy flag on the acquired primary segment B.

[0055] <Overall Processing> The overall processing procedure is the same as in Specific Example 1, except for the processing of the accuracy flag setting unit 106*. The processing other than that of the accuracy flag setting unit is therefore described with reference to FIG. 6.
(1) Step S1. First, the image to be processed is input. As in Specific Example 1, the input image is one in which the characters are arranged in a single line, generally called a line image. The line image is read and stored in the image storage unit.

[0056] (2) Step S2. Next, primary segment extraction is performed, by the same method as described for Specific Example 1. FIG. 16(a) illustrates an example of primary segment extraction for the data of Specific Example 2. The rectangular frames drawn over the character pattern "純一郎" (Junichiro) in the figure are the primary segments. The width W and height H of each extracted primary segment and their ratio ρ (hereinafter, the aspect ratio) are also calculated, and the extracted primary segment is stored together with this information in the primary/secondary segment information storage unit. This information is used in later processing.

[0057] (3) Step S3. Next, the accuracy flag setting process is performed. The specific processing method is described later.
(4) Step S4. Next, the secondary segment creation process is performed. Specific Example 2 uses the same processing as described for Specific Example 1. FIGS. 16(b) and 16(c) show examples.

[0058] FIG. 17 shows the conditions for setting the accuracy flag, and FIG. 18 shows an example of secondary segment creation. In the illustrated example, accuracy flags are set before and after the character "一" (the kanji numeral one), so the two secondary segments shown in FIG. 18(a) are created, and a secondary segment with an accuracy flag in its interior, as shown in FIG. 18(b), is not created.

[0059] (5) Step S5. Next, character pattern extraction is performed. Using the coordinate information of the primary and secondary segments created above, the image of the region enclosed by each segment frame is extracted as the character pattern of that segment and stored in the character information storage unit.
(6) Step S6. Next, character recognition is performed. The character pattern information is read from the character information storage unit, and each pattern is recognized as a character. The top K recognition results, that is, the character codes and their certainty factors Ri (K = 1 in this specific example), are stored in the character information storage unit as the character information of the corresponding segment.

[0060] (7) Step S7. Next, the candidate tree creation process is performed, in the same way as in Specific Example 1. FIG. 19 shows the candidate tree of the comparative example, and FIG. 20 shows the candidate tree of Specific Example 2. The candidate tree of FIG. 19 is the one created without the accuracy flag, and the candidate tree of FIG. 20 is the one created with it. In Specific Example 2, introducing the accuracy flag reduces the candidate tree to 4/9 of its size, which naturally also improves the processing speed.

[0061] (8) Step S8. Finally, the result output process is performed. The final result is selected from the candidate tree in the same way as in Specific Example 1. In the example of FIG. 20, the bottom branch has the largest evaluation value and is therefore the final result.

[0062] <Processing of the Accuracy Flag Setting Unit> FIG. 21 is a flowchart of the processing of the accuracy flag setting unit of Specific Example 2.
(1) Step S41. First, the information on a primary segment is acquired.
(2) Step S42. Next, as shown in FIG. 17, the width W, height H, and aspect ratio R of the primary segment are calculated (the aspect ratio is the quantity denoted ρ in step S2). These are the feature values.

[0063] (3) Step S43. It is determined whether the calculated feature values fall within the ranges given by thresholds TH prepared in advance, that is, whether the expressions shown in FIG. 17 are satisfied. For example, if W is not between TH1 and TH2, the accuracy flag is not set. If H is not between TH3 and TH4, the accuracy flag is not set. If R is not between TH5 and TH6, the accuracy flag is not set. The accuracy flag is set only when all three expressions are satisfied. If NO, the process proceeds to step S47 without doing anything.
(4) Step S44. If YES, the accuracy flag is set on the primary segment, which is treated as primary segment A.
(5) Step S45. The information on primary segment B, the primary segment to the left of primary segment A, is acquired.

[0064] (6) Step S46. The accuracy flag is set on primary segment B.
(7) Step S47. It is determined whether another primary segment exists to the right of the current primary segment. If YES, the process returns to step S41; if NO, it proceeds to OUT and ends.
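The shape test of steps S41 to S47 can be sketched as follows. Several details are assumptions made for illustration and are not stated in the description: the class and function names, the inclusive range bounds, the aspect ratio taken as width divided by height, a positive height, and the choice to skip step S46 when the flagged segment has no left neighbour.

```python
from dataclasses import dataclass

@dataclass
class PrimarySegment:
    width: float
    height: float        # assumed to be positive
    flag: bool = False   # accuracy flag on the segment's right edge

def set_shape_flags(segments, th):
    """th = (TH1, TH2, TH3, TH4, TH5, TH6), the bounds on W, H and R."""
    th1, th2, th3, th4, th5, th6 = th
    for i, a in enumerate(segments):          # steps S41 and S47: left to right
        w, h = a.width, a.height              # step S42: feature values
        r = w / h                             # assumed direction of the aspect ratio
        ok = (th1 <= w <= th2) and (th3 <= h <= th4) and (th5 <= r <= th6)  # step S43
        if ok:
            a.flag = True                     # step S44: flag segment A
            if i > 0:                         # step S45: left neighbour B, if any
                segments[i - 1].flag = True   # step S46: flag segment B
    return segments
```

Flagging both the matching segment and its left neighbour closes the character on both sides, because each flag marks the right edge of the segment that carries it.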

[0065] In the example of FIG. 16 for Specific Example 2, accuracy flags are set on primary segments in this way. The flags are set before and after "一" because the ranges of the feature values (width, height, aspect ratio) of the character "一" were obtained statistically in advance and prepared as TH1 through TH6 in the accuracy flag setting conditions of FIG. 17. Feature thresholds are thus prepared as empirical values for judging whether the size and shape are plausible for a character of a particular shape. Any number of threshold sets may therefore be prepared, as long as the determination remains possible.

[0066] <Effect of Specific Example 2> Preparing feature thresholds for setting the accuracy flag in advance limits the number of secondary segments created and therefore the size of the candidate tree in the later stage. This greatly reduces the amount of computation and the processing time, solving the problem of the prior art. In particular, with the accuracy flag calculation of Specific Example 2, characters shaped like a horizontal bar (such as the kanji numeral "一" shown in the example, or the hyphen) can be determined to be single independent characters even when they lie close to neighbouring characters, so the effect is very large. This is also evident from FIGS. 19 and 20, where the candidate tree shrinks to about 45% of its size without the accuracy flag.

[0067] <<Specific Example 3>> <Overall Configuration> FIG. 22 shows a block diagram of the whole device of Specific Example 3. The configuration is almost the same as that of Specific Examples 1 and 2, but the relationship between the accuracy flag setting unit 2206 and the character information storage unit 2203 differs. The configuration of this specific example is described with reference to this figure.

[0068] The device includes a control unit 2200 that controls the overall processing, an image storage unit 2201 that stores the input image, a primary/secondary segment information storage unit 2202 that temporarily stores the primary and secondary segments, which are intermediate information on the characters, a character information storage unit 2203 that temporarily stores the character images and the character codes obtained by recognizing them, and a candidate tree storage unit 2204 that stores combinations of primary and secondary segments as a candidate tree.

[0069] The device also includes a primary segment extraction unit 2205, which reads the input image from the image storage unit, extracts the primary segments, and stores the primary segment information in the primary/secondary segment information storage unit; an accuracy flag setting unit 2206, which reads the stored primary segment information from the primary/secondary segment information storage unit and sets the accuracy flags; and a secondary segment creation unit 2207, which, as in Specific Example 1, reads the primary segment information from the primary/secondary segment information storage unit and, when the distance between a primary segment and an adjacent primary segment satisfies a condition prepared in advance, merges the two into a secondary segment and records it in the primary/secondary segment information storage unit.

[0070] The device further includes a character pattern extraction unit 2208, which reads the primary and secondary segment information from the primary/secondary segment information storage unit, reads the image at the corresponding coordinates from the image storage unit, and stores the extracted image in the character information storage unit as the character pattern enclosed by the segment; and a character recognition unit 2209, which recognizes the extracted character patterns held in the character information storage unit and stores the top K character codes of the recognition result in the character information storage unit.

[0071] The device also includes a candidate tree creation unit 2210, which reads the segment information from the primary/secondary segment information storage unit and the character information from the character information storage unit, creates the combinations of character segmentations (hereinafter, the candidate tree), and stores it in the candidate tree storage unit; and a result output unit 2211, which outputs from the candidate tree storage unit the combination of segments that gives the best segmentation and recognition result.

[0072] <Accuracy Flag Setting Unit> FIG. 23 shows a block diagram of the accuracy flag setting unit of Specific Example 3, and its configuration is described with reference to this figure. The accuracy flag setting unit includes a primary segment acquisition unit 2800 for acquiring primary segment information from the primary/secondary segment information storage unit, a character information acquisition unit 2801 for acquiring the character information corresponding to the primary segment, a segment character code information storage unit 2802 for temporarily storing the acquired character information of the primary segment, and a flaggable character code storage unit 2803 for storing the character codes for which the accuracy flag may be set.

[0073] The unit also includes an accuracy flag set determination unit 2804, which compares these character codes and determines whether to set the accuracy flag, and a primary segment setting unit 2805, which sets the flag according to the result of that determination.

[0074] <Description of Overall Operation> The operating procedure of the device in Specific Example 3 is almost the same as in Specific Examples 1 and 2. It differs in two respects: the processing of the accuracy flag setting unit, and the point in the overall process at which recognition is performed. The overall process and the accuracy flag process are therefore described below; the remaining processing is the same as in Specific Example 1.

[0075] FIG. 24 shows a flowchart of the overall processing of the device of Specific Example 3. Processing begins at the start position in the figure, runs through the series of steps, and finishes at the end position.
(1) Step S51. First, the image to be processed is input. As in Specific Examples 1 and 2, the input image is one in which the characters are arranged in a single line, generally called a line image. The line image is read and stored in the image storage unit.
(2) Step S52. Next, primary segment extraction is performed. This can be implemented easily with the well-known histogram-based method.

[0076] FIG. 25(a) illustrates an example of primary segment extraction. As shown, five primary segments are obtained from the image of the kana characters "めぐみ" (Megumi).

[0077] (3) Step S53. Next, the character patterns of the primary segments are extracted. Using the information on each extracted primary segment, the image at the corresponding coordinates is cut out as a character pattern and stored in the character information storage unit 2203.
(4) Step S54. Next, character recognition is performed on the primary segments. The character pattern of each extracted primary segment is passed to the character recognition unit 2209 and recognized. The character codes and certainty factors of the top K recognition results are stored in the character information storage unit (K = 1 in this specific example).

[0078] FIG. 25(b) illustrates an example of primary segment recognition results. As shown, recognition results from rank 1 to rank 5 are obtained for the third primary segment from the left (for this primary segment, only rank 1).

[0079] (5) Step S55. Next, the accuracy flag setting process is performed. Unlike Specific Examples 1 and 2, this specific example sets the accuracy flag on the basis of the recognition results. The specific processing is described later.

[0080] FIG. 26 shows an example of accuracy flag setting. As shown, the accuracy flag, marked by a downward-pointing triangle, is set between the third and fourth primary segments.
(6) Step S56. Next, the secondary segment creation process is performed. The procedure is the same as in Specific Examples 1 and 2.

[0081] (7) Step S57. Next, the character patterns of the secondary segments are cut out. Using the coordinate information of each secondary segment, the image of the area enclosed by the segment frame is extracted as the character pattern of that segment and stored in the character information storage unit 2203.
(8) Step S58. Next, character recognition is performed on the secondary segments. The recognition results are stored in the character information storage unit 2203.
(9) Step S59. Next, the candidate tree is created. The procedure is exactly the same as in specific examples 1 and 2.
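The character pattern cut-out of step S57 amounts to cropping the image to the segment frame. A minimal sketch, assuming the image is a list of pixel rows and the frame is given as inclusive `(left, top, right, bottom)` coordinates (both representations, and the name `cut_out_pattern`, are assumptions made for illustration):

```python
from typing import List, Tuple


def cut_out_pattern(
    image: List[List[int]],
    frame: Tuple[int, int, int, int],
) -> List[List[int]]:
    """Return the sub-image enclosed by the segment frame.

    image is a list of pixel rows (1 = black, 0 = white); frame holds
    inclusive (left, top, right, bottom) pixel coordinates.
    """
    left, top, right, bottom = frame
    return [row[left:right + 1] for row in image[top:bottom + 1]]
```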

[0082] FIGS. 27 and 28 show examples of candidate tree creation results. FIG. 27 shows the candidate tree created without the certainty flag, and FIG. 28 shows the candidate tree created with the certainty flag. As these examples show, introducing the certainty flag reduces the size of the candidate tree from ten branches to six.
(10) Step S60. Finally, the result is output. From the candidate tree created above, the branch with the largest evaluation value is output as the final result. In the example of FIG. 28, the third branch from the bottom is the final result, and its recognition result is “めぐみ” (Megumi).
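The effect of the certainty flag on the candidate tree, and the selection of the final branch in step S60, can be illustrated with the following sketch. It is not the candidate tree creation procedure of the specific examples: the names `enumerate_branches` and `best_branch`, the merge limit `max_merge`, and the `evaluate` callback standing in for the evaluation value are all assumptions.

```python
from typing import Callable, List, Sequence, Tuple

Span = Tuple[int, int]  # (first, last) primary segment indices of one secondary segment


def enumerate_branches(
    num_primary: int,
    flagged: Sequence[bool],
    max_merge: int = 3,  # hypothetical limit on merged primary segments
) -> List[List[Span]]:
    """List every way to cover all primary segments with secondary segments.

    flagged[i] is True when a certainty flag marks a character boundary
    immediately to the right of primary segment i.
    """
    branches: List[List[Span]] = []

    def extend(start: int, path: List[Span]) -> None:
        if start == num_primary:
            branches.append(list(path))
            return
        for last in range(start, min(start + max_merge, num_primary)):
            path.append((start, last))
            extend(last + 1, path)
            path.pop()
            if flagged[last]:
                break  # never merge across a confirmed character boundary

    extend(0, [])
    return branches


def best_branch(
    branches: List[List[Span]],
    evaluate: Callable[[List[Span]], float],
) -> List[Span]:
    """Return the branch with the largest evaluation value."""
    return max(branches, key=evaluate)
```

For a hypothetical input of five primary segments, this sketch produces 13 branches without any flag and 8 branches when a flag is set after the third segment. These counts belong to the hypothetical input only and are not the figures of FIGS. 27 and 28.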

[0083] <Certainty flag setting unit> This section describes the certainty flag processing in specific example 3. FIG. 29 shows the processing flowchart of the certainty flag setting unit of specific example 3.
(1) Step S61. First, information on the primary segments is acquired in order, starting from the leftmost one.
(2) Step S62. The character information of the current primary segment is acquired.
(3) Step S63. The character code in that character information is compared with the character codes, prepared in advance, that carry a certainty flag.

[0084] (4) Step S64. If the comparison finds a matching character code, a certainty flag is set on the current primary segment. Otherwise, processing proceeds directly to step S65.
(5) Step S65. If another primary segment exists to the right of the current one, processing returns to step S61; otherwise, processing ends. As FIG. 26 shows, a dakuten (voiced sound mark) must lie at a character boundary, so the certainty flag is set on its right side.
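Steps S61 to S65 can be sketched as a single left-to-right pass over the first-ranked recognition results of the primary segments. This is a minimal sketch under stated assumptions: the set `BOUNDARY_CODES`, the function name `set_certainty_flags`, and the use of plain characters in place of character codes are hypothetical and are not taken from the specification.

```python
from typing import FrozenSet, List, Sequence

# Hypothetical set of character codes that carry a certainty flag
# (marks expected to sit at a character boundary).
BOUNDARY_CODES: FrozenSet[str] = frozenset({"゛", "゜", "、", "。"})


def set_certainty_flags(
    top_codes: Sequence[str],
    boundary_codes: FrozenSet[str] = BOUNDARY_CODES,
) -> List[bool]:
    """Return one flag per primary segment, scanned from the leftmost one.

    top_codes[i] is the first-ranked recognition result of primary segment i.
    A True flag means a character boundary lies to the right of that segment.
    """
    flags = []
    for code in top_codes:                    # S61, S62: take each segment in order
        flags.append(code in boundary_codes)  # S63, S64: compare and set the flag
    return flags                              # S65: stop after the rightmost segment
```

For a hypothetical recognition result `["め", "く", "゛", "み"]`, only the third entry is flagged, which places the boundary between the third and fourth primary segments.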

[0085] <Effect of specific example 3> By preparing a dictionary that recognizes figures that must fall at a character boundary, such as the dakuten (voiced sound mark), so that a certainty flag can be set for them, certainty flags can easily be set on primary segments and the number of secondary segments created can be reduced. This in turn limits the size of the candidate tree needed for subsequent processing. The amount of computation and the processing time are therefore reduced, which solves the conventional problem.

[0086] In particular, the certainty flag of specific example 3 is effective when it is computed from the feature amounts of figures that form a specific part of a character and can be expected always to satisfy the condition for adding a certainty flag, such as punctuation marks, dakuten (voiced sound marks), and handakuten (semi-voiced sound marks).

[0087] <Usage modes>
(1) Each of the specific examples above can be widely used in methods and devices that cut characters out of a character image for character recognition.
(2) Specific examples 1, 2, and 3 were described for horizontally written characters, but they can also be applied to free-form character strings written vertically or diagonally.
(3) Specific examples 1, 2, and 3 were described for handwritten characters, but they can also be applied to the recognition of printed characters and symbols.

[0088] (4) In all of specific examples 1, 2, and 3, various evaluation values (for example, the recognition certainty) and the processing needed to compute them (for example, recognition processing) were adopted in order to select the optimal cut-out from the candidate tree. However, the characteristic feature lies in how the certainty flag, which expresses the certainty of a cut-out position, is set and used. Other processing, such as the recognition method and the method of evaluating recognition results, may be changed freely.

[Brief description of the drawings]

FIG. 1 is a block diagram of the entire character recognition device of specific example 1.

FIG. 2 is an explanatory diagram of a conventionally known histogram and character recognition method.

FIG. 3 is a block diagram of the certainty flag setting unit of specific example 1.

FIG. 4 is a block diagram of the secondary segment creation unit of specific example 1.

FIG. 5 is a block diagram of the candidate tree creation unit of specific example 1.

FIG. 6 is an operation flowchart of the entire device.

FIG. 7(a) is an explanatory diagram of an example of primary segment extraction. FIG. 7(b) is an explanatory diagram of an example of certainty flag setting.

FIG. 8 is an explanatory diagram of an example of secondary segment creation.

FIG. 9 is an explanatory diagram of the candidate tree of a comparative example.

FIG. 10 is an explanatory diagram of the candidate tree of specific example 1.

FIG. 11 is a flowchart of the certainty flag setting process of specific example 1.

FIG. 12 is a flowchart of the secondary segment creation unit of specific example 1.

FIG. 13 is a flowchart of the candidate tree creation unit of specific example 1.

FIG. 14 is an explanatory diagram of an example of evaluation value calculation.

FIG. 15 is a block diagram of the certainty flag setting unit of specific example 2.

FIG. 16 is an explanatory diagram showing an example of primary segment extraction and an example of certainty flag setting.

FIG. 17 is an explanatory diagram showing the conditions for setting the certainty flag.

FIG. 18 is an explanatory diagram of an example of secondary segment creation.

FIG. 19 is an explanatory diagram of the candidate tree of a comparative example.

FIG. 20 is an explanatory diagram of the candidate tree of specific example 2.

FIG. 21 is a flowchart showing the processing procedure of the certainty flag setting unit of specific example 2.

FIG. 22 is a block diagram of the entire device of specific example 3.

FIG. 23 is a block diagram of the certainty flag setting unit of specific example 3.

FIG. 24 is an overall processing flowchart of the device of specific example 3.

FIG. 25(a) is an explanatory diagram of an example of primary segment extraction, and FIG. 25(b) is an explanatory diagram of an example of primary segment recognition results.

FIG. 26 is an explanatory diagram of an example of certainty flag setting.

FIG. 27 is an explanatory diagram of the candidate tree of a comparative example without the certainty flag.

FIG. 28 is an explanatory diagram of the candidate tree of specific example 3 with the certainty flag.

FIG. 29 is a processing flowchart of the certainty flag setting unit of specific example 3.

[Explanation of symbols]

100 control unit
101 image storage unit
102 primary/secondary segment information storage unit
103 character information storage unit
104 candidate tree storage unit
105 primary segment extraction unit
106 certainty flag setting unit
107 secondary segment creation unit
108 character pattern cut-out unit
109 character recognition unit
110 candidate tree creation unit
111 result output unit

Claims

[Claims]

1. A character recognition device comprising: a primary segment extraction unit that, for a character string arranged in an input image, obtains a histogram representing the distribution of black points constituting each character, obtains a plurality of cut-out candidate positions oriented substantially perpendicular to the direction in which the characters are arranged, and obtains a group of primary segments by dividing the character string into a plurality of parts at the cut-out candidate positions; a certainty flag setting unit that sets a certainty flag to distinguish a primary segment at a boundary between adjacent characters from other segments; a secondary segment creation unit that creates secondary segments by integrating adjacent primary segments extracted by the primary segment extraction unit while, with reference to the certainty flag, excluding combinations of primary segments that straddle a character boundary; a character recognition unit that performs character recognition on the candidate character image enclosed by each secondary segment; and a candidate tree selection unit that evaluates the character recognition results using a candidate tree.

2. The character recognition device according to claim 1, wherein the certainty flag setting unit calculates, for every primary segment, the space width between that primary segment and the adjacent primary segment and, if the space width is equal to or greater than a predetermined threshold, sets a certainty flag for the segment, thereby adding to one of the primary segments information that a character boundary lies between the segments.

3. The character recognition device according to claim 1, wherein the certainty flag setting unit obtains, for every primary segment, a feature amount representing the shape of the primary segment and adds a certainty flag to the segment when the feature amount falls within a range given in advance.

4. The character recognition device according to claim 1, wherein the certainty flag setting unit first performs character recognition on every primary segment, compares the primary segment against a dictionary that recognizes a figure forming a specific part of a character, and adds a certainty flag to the primary segment when it is recognized as the corresponding figure.