JP5288915B2

JP5288915B2 - Character recognition device, character recognition method, computer program, and storage medium

Info

Publication number: JP5288915B2
Application number: JP2008178370A
Authority: JP
Inventors: 裕章池田; 英明松本
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2008-07-08
Filing date: 2008-07-08
Publication date: 2013-09-11
Anticipated expiration: 2028-07-08
Also published as: JP2010020421A

Description

The present invention relates to a technique for recognizing characters in an image.

There is a technique for recognizing handwritten characters written inside a frame, bounded by ruled lines, that is intended to hold a plurality of characters. In this technique, a character string is first divided into pixel blocks, and each pattern formed by combining pixel blocks is treated as a character and subjected to recognition processing. A character evaluation value is then calculated from the recognition result, and dynamic programming is used to search for the sequence of pixel block combinations that maximizes the sum of the evaluation values.

The following conventional techniques exist for reducing the number of candidate sequences of pixel block combinations.
(1) Patent Document 1, for example, narrows down the sequences of pixel block combinations by exploiting the regularity of character arrangement in parts such as the chome/address part.
(2) Patent Document 2, for example, narrows down the sequences of pixel block combinations by taking into account that the character types are limited in specific parts such as the place name part and the chome/address part.
Japanese Patent Laid-Open No. 06-124366
Japanese Patent Laid-Open No. 08-243504

However, even with the conventional techniques above, when the place name part is recognized and many pixel blocks come from characters whose components, such as the left-hand radical (hen) and the right-hand component (tsukuri), have been separated from each other, a search using dynamic programming may take a long time. In addition, as the number of pixel block combinations grows, the misrecognition rate may increase.

In view of the above problems, a character recognition device according to the present invention comprises:
pixel block extraction means for extracting, from input image data, a plurality of pixel blocks constituting a character image;
detection means for detecting, from the plurality of pixel blocks, a portion in which specific characters are consecutive;
combining means for combining adjacent pixel blocks, among the pixel blocks excluding the portion of consecutive specific characters detected by the detection means, when the distance between the adjacent pixel blocks is within a predetermined distance determination threshold and the aspect ratio of the adjacent pixel blocks when combined is within a predetermined aspect ratio determination threshold;
character candidate acquisition means for acquiring, by character recognition processing, character candidates corresponding to the combined pixel blocks and evaluation values corresponding to the character candidates; and
determination means for determining the character candidates that maximize the evaluation value over the entire character image,
wherein the specific characters detected by the detection means include numerals and delimiters that separate one numeral from another.

According to the present invention, recognition speed and recognition accuracy are improved when characters are automatically read from an image.

Hereinafter, preferred embodiments of the present invention will be described in detail, by way of example, with reference to the drawings. However, the constituent elements described in the embodiments are merely examples; the technical scope of the present invention is defined by the claims and is not limited by the individual embodiments below.

(First embodiment)
FIG. 1 is a block diagram showing a schematic configuration of the character recognition device according to the first embodiment.

The CPU 101 controls the entire device by executing a control program stored in the ROM 102. The ROM 102 stores the computer program executed by the CPU 101 and various parameter data. When executed by the CPU 101, the computer program causes the character recognition device (computer) to function as execution means for performing each process shown in the flowcharts described later. In the present embodiment, the processing corresponding to each step of those flowcharts is implemented in software using the CPU 101, but part or all of the processing may instead be implemented in hardware such as electronic circuits. The character recognition device of the present invention may be implemented on a general-purpose personal computer or as a device dedicated to character recognition.

The RAM 103 stores images and various information. The RAM 103 also functions as a work area for the CPU and as a temporary save area for data.

The external storage device 104 stores various data such as dictionaries. The external storage device 104 is, for example, a hard disk or a CD-ROM. The computer program for causing a computer to execute the control method of the character recognition device may be stored in a computer-readable external storage medium or may be supplied via a network. The display 105 is, for example, an LCD or a CRT.

The input device 106 may be an interface for connecting an image input device such as a scanner or a digital camera, or may be the image input device itself, such as a scanner.

The network interface (I/F) 107 communicates with external devices connected to the network, such as a server, an external storage device, or an image input device, to read and write programs and data. The network is typically a communication network such as the Internet, a LAN, a WAN, or a telephone line; any network capable of transmitting and receiving data will do. The display 105 and the input device 106 may also be connected via the network interface 107.

Such a character recognition device is implemented, for example, in the system shown in FIG. 2. FIG. 2 is a diagram showing a configuration example of a computer system that can be employed in the first embodiment. Reference numeral 201 denotes a computer device, which receives image data optically read by a scanner 202 and executes processing for character recognition.

Next, the character recognition processing will be described with reference to FIGS. 3 to 9. In the present embodiment, an address written in image data that has been optically read by a scanner or the like is recognized.

FIG. 3 is a diagram showing an example of a handwritten address. A Japanese address consists of a place name part 301 and a chome/address part 302. In FIG. 3, "東京都世田谷区下馬" (Shimouma, Setagaya-ku, Tokyo) is the place name part, and everything after it is the chome/address part.

FIG. 4 is a flowchart showing the character recognition processing in the character recognition device of the first embodiment. This processing is executed under the overall control of the CPU 101.

In step S401, a plurality of pixel blocks constituting a character image are extracted from the input image data. Rows are extracted by taking a projection in the row direction and forming a rectangle along the row direction (a row rectangle) whose height is the extent over which the projection is non-empty. Pixel blocks are extracted by finding a pixel that forms part of a character within the row rectangle and tracing its contour; when the trace returns to the starting pixel, the traced region is taken as one pixel block. All pixel blocks in the row rectangle are extracted in the same way, and for horizontal writing, pixel blocks that lie one above the other are combined. Alternatively, a projection may be taken in the vertical direction, and pixel blocks may be extracted so that the width of each block is the extent over which the projection is non-empty. After that, isolated small pixel blocks that form part of a character are combined with the pixel blocks lying close to them.
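The projection-based variant of step S401 can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes a binary raster given as nested lists (1 for ink, 0 for background), and the names `nonzero_runs`, `row_rectangles`, and `column_blocks` are hypothetical. Contour tracing and the merging of small isolated blocks are not shown.

```python
from typing import List, Tuple

Run = Tuple[int, int]  # (start, end) with end exclusive


def nonzero_runs(profile: List[int]) -> List[Run]:
    """Return the maximal runs of indices where the projection is non-zero."""
    runs: List[Run] = []
    start = None
    for i, count in enumerate(profile):
        if count > 0 and start is None:
            start = i                      # a run of ink begins here
        elif count == 0 and start is not None:
            runs.append((start, i))        # the run ended at index i - 1
            start = None
    if start is not None:                  # the run reaches the last index
        runs.append((start, len(profile)))
    return runs


def row_rectangles(image: List[List[int]]) -> List[Run]:
    """Row-direction projection: one (top, bottom) pair per text row."""
    return nonzero_runs([sum(row) for row in image])


def column_blocks(image: List[List[int]], top: int, bottom: int) -> List[Run]:
    """Vertical projection inside one row rectangle: (left, right) per block."""
    width = len(image[0]) if image else 0
    profile = [sum(image[y][x] for y in range(top, bottom)) for x in range(width)]
    return nonzero_runs(profile)
```

Calling `row_rectangles` first and then `column_blocks` on each returned row gives the row rectangle and the block extents within it.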

In step S402, a portion in which specific characters are consecutive (the chome/address part) is detected from the pixel blocks. The details of step S402 are shown in FIG. 5.

In step S501 of FIG. 5, delimiters are detected. A delimiter is a character that appears between numerals in the chome/address part, such as "-" (hyphen), "の" (no), "ノ" (no), "丁目" (chome), "番地" (banchi), or "号" (go). The details of step S501 are shown in FIG. 6.

In step S601 of FIG. 6, delimiter candidates are detected from shape, position information, and the like. A hyphen, for example, can be identified from the aspect ratio of the pixel block and from where the pixel block sits within the row height (the height of one row). Sample address entries show that many people use a hyphen as the delimiter, so delimiter candidates can often be detected from character shape and position alone, which is expected to speed up processing.
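A shape-and-position test for hyphen candidates, as described for step S601, might look like the sketch below. The text gives no concrete thresholds, so `min_aspect` and `band` are illustrative assumptions, and the `Block` type and function name are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Block:
    left: int
    top: int
    right: int   # exclusive
    bottom: int  # exclusive

    @property
    def width(self) -> int:
        return self.right - self.left

    @property
    def height(self) -> int:
        return self.bottom - self.top


def is_hyphen_candidate(block: Block, row_top: int, row_bottom: int,
                        min_aspect: float = 2.0,
                        band: Tuple[float, float] = (0.3, 0.7)) -> bool:
    """True if the block is wide and flat and sits near mid-height of the row.

    min_aspect and band are assumed values, not taken from the source.
    """
    row_height = row_bottom - row_top
    if block.height <= 0 or row_height <= 0:
        return False
    aspect = block.width / block.height              # wide and flat -> large
    center = (block.top + block.bottom) / 2
    relative = (center - row_top) / row_height       # 0 = row top, 1 = row bottom
    return aspect >= min_aspect and band[0] <= relative <= band[1]
```

Because this check needs only bounding-box arithmetic, it avoids running character recognition on every block, which is the speed benefit the text points to.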

In step S602, steps S603 and S604 are repeated for as long as delimiter candidates detected in step S601 remain.

In step S603, it is determined whether the characters on both sides of the delimiter candidate are numerals. If both are numerals, the process proceeds to step S604.

In step S604, the delimiter candidate is confirmed as a delimiter.

In step S605, it is checked whether a plurality of delimiters were detected in steps S601 to S604. If so, the delimiter detection processing ends. If not, the process proceeds to step S606.

In step S606, delimiter candidates are detected using OCR: single-character recognition is applied repeatedly to the pixel blocks to find delimiter candidates.

In step S607, steps S608 and S609 are repeated for as long as delimiter candidates detected in step S606 remain.

In step S608, it is determined whether the characters on both sides of the delimiter candidate are numerals. If both are numerals, the process proceeds to step S609.

In step S609, the delimiter candidate is confirmed as a delimiter.

When step S609 is finished, the process returns to FIG. 5 and continues at step S502.
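The two-stage flow of FIG. 6 (cheap shape-based candidates first, OCR-based candidates only as a fallback) can be sketched as below. Several simplifying assumptions are made: each delimiter occupies exactly one block index, candidate detectors are passed in as callables, and delimiters confirmed in the first stage are kept when the second stage runs. All names are hypothetical.

```python
from typing import Callable, List


def confirm(candidates: List[int], n_blocks: int,
            is_numeral: Callable[[int], bool]) -> List[int]:
    """S603/S604 and S608/S609: keep candidates with a numeral on both sides."""
    return [i for i in candidates
            if 0 < i < n_blocks - 1 and is_numeral(i - 1) and is_numeral(i + 1)]


def detect_delimiters(n_blocks: int,
                      shape_candidates: Callable[[], List[int]],
                      ocr_candidates: Callable[[], List[int]],
                      is_numeral: Callable[[int], bool]) -> List[int]:
    """Return the block indices confirmed as delimiters."""
    # S601 to S604: candidates from shape and position only.
    confirmed = confirm(shape_candidates(), n_blocks, is_numeral)
    # S605: stop early if a plurality of delimiters was already found.
    if len(confirmed) >= 2:
        return confirmed
    # S606 to S609: fall back to OCR-based candidates (called lazily).
    extra = confirm(ocr_candidates(), n_blocks, is_numeral)
    return sorted(set(confirmed) | set(extra))
```

Passing the OCR detector as a callable means it is never invoked when the shape-based stage already finds two or more delimiters.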

In step S502 of FIG. 5, it is determined whether a delimiter has been detected. If a delimiter has been detected, the process proceeds to step S503. If no delimiter has been detected, the chome/address part detection processing ends.

In step S503, the chome/address part is selected. The details of step S503 are shown in FIG. 7.

In step S701 of FIG. 7, for horizontal writing, steps S702 to S704 are repeated, working leftward from the leftmost delimiter, until the left boundary of the chome/address part is fixed.

In step S702, single-character recognition is performed on the character immediately to the left of the leftmost delimiter.

In step S703, it is determined whether the result of the single-character recognition is a numeral. If it is, the process proceeds to step S704; if not, the process proceeds to step S705.

In step S704, the boundary of the chome/address part is extended by one character, and the process returns to step S701. In the next pass, step S702 recognizes the character one position further to the left, followed by steps S703 and S704.

In step S705, for horizontal writing, steps S706 to S708 are repeated, working rightward from the rightmost delimiter, until the right boundary of the chome/address part is fixed.

In step S706, single-character recognition is performed on the character immediately to the right of the rightmost delimiter.

In step S707, it is determined whether the result of the single-character recognition is a numeral. If it is, the process proceeds to step S708; if not, the chome/address part selection processing ends.

In step S708, the boundary of the chome/address part is extended by one character, and the process returns to step S705. In the next pass, step S706 recognizes the character one position further to the right, followed by steps S707 and S708.
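The boundary expansion of FIG. 7 amounts to growing a span outward from the outermost delimiters while the neighbouring block is recognized as a numeral. The sketch below assumes horizontal writing, one block per character or delimiter, and a hypothetical `recognize` callable that returns the single-character recognition result for a block index.

```python
from typing import Callable, List, Tuple


def select_address_span(n_blocks: int, delimiters: List[int],
                        recognize: Callable[[int], str]) -> Tuple[int, int]:
    """Return (first, last) block indices, inclusive, of the chome/address part."""
    left, right = min(delimiters), max(delimiters)
    # S701 to S704: extend leftward while the left neighbour is a numeral.
    while left - 1 >= 0 and recognize(left - 1).isdigit():
        left -= 1
    # S705 to S708: extend rightward while the right neighbour is a numeral.
    while right + 1 < n_blocks and recognize(right + 1).isdigit():
        right += 1
    return left, right
```

Recognition is requested one block at a time, only for the blocks adjacent to the current boundary, so blocks deep inside the place name part are never recognized in this step.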

FIG. 8 shows examples in which the chome/address part has been selected in step S503.

Reference numeral 801 denotes an example in which "丁目" (chome), "番地" (banchi), and "号" (go) are used as delimiters. Because no numeral follows "号" on its right, "号" is not included in the chome/address part. This is not a problem, however, because it does not affect the radical combination processing performed in step S404.

Reference numeral 802 denotes an example in which "の" is used as the delimiter. In this case, the processing treats the entire string as the chome/address part.

Reference numeral 803 denotes an example in which "ノ" is used as the delimiter. In this case, the processing treats the entire string as the chome/address part.

Reference numeral 804 denotes an example in which "丁目" and "-" (hyphen) are mixed as delimiters. In this case, the processing treats the entire string as the chome/address part.

Reference numeral 805 denotes an example in which "丁目" and "の" are mixed as delimiters. In this case, the processing treats the entire string as the chome/address part.

Reference numeral 806 denotes an example in which "丁目" and "ノ" are mixed as delimiters. In this case, the processing treats the entire string as the chome/address part.

When the chome/address part selection processing of FIG. 5 (step S503) is finished, the process proceeds to step S403 of FIG. 4.

In step S403 of FIG. 4, it is determined whether a chome/address part has been detected. If one has been detected, the process proceeds to step S404; if not, the process proceeds to step S405.

In step S404, the pixel blocks other than those in the chome/address part detected in step S402 are examined, and pixel blocks that have been separated into a left-hand radical (hen) and a right-hand component (tsukuri) are combined. Adjacent pixel blocks are combined when the distance between them and their aspect ratio satisfy predetermined thresholds. For example, pixel blocks are combined when the height of the pixel blocks is at least two thirds of the row height, the distance between the pixel blocks is within a fixed threshold (the distance determination threshold), and the aspect ratio of the combined pixel blocks is within a fixed threshold (the aspect ratio determination threshold). To tighten the hen/tsukuri condition, a further condition may be added that only pixel blocks with an aspect ratio of 0.4 to 0.6 in width to 1 in height are considered. This makes it possible to combine the radicals found in the place name part while preventing tall, narrow characters such as the numerals in the chome/address part from being combined with each other.
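The combination rule of step S404 can be sketched as follows. The threshold values are left to the caller because the text does not state them; "aspect ratio within the threshold" is interpreted here as width divided by height not exceeding `max_aspect`, which is an assumption. Pairs whose bounding rectangles overlap horizontally are skipped, mirroring the "谷" case in FIG. 9. The `Box` type and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class Box:
    left: int
    top: int
    right: int
    bottom: int

    @property
    def width(self) -> int:
        return self.right - self.left

    @property
    def height(self) -> int:
        return self.bottom - self.top


def union(a: Box, b: Box) -> Box:
    """Bounding rectangle of two boxes."""
    return Box(min(a.left, b.left), min(a.top, b.top),
               max(a.right, b.right), max(a.bottom, b.bottom))


def should_combine(a: Box, b: Box, row_height: int,
                   max_gap: int, max_aspect: float) -> bool:
    """a is the left block, b the right block of an adjacent pair."""
    tall_enough = min(a.height, b.height) * 3 >= row_height * 2  # >= 2/3 row height
    gap = b.left - a.right                                       # negative if overlapping
    merged = union(a, b)
    return (tall_enough and 0 <= gap <= max_gap
            and merged.width <= max_aspect * merged.height)


def combine_radicals(boxes: List[Box], protected: Set[int], row_height: int,
                     max_gap: int, max_aspect: float) -> List[Box]:
    """Merge adjacent pairs left to right; indices in `protected` (the
    chome/address part) are never merged."""
    out: List[Box] = []
    i = 0
    while i < len(boxes):
        if (i + 1 < len(boxes) and i not in protected and i + 1 not in protected
                and should_combine(boxes[i], boxes[i + 1], row_height,
                                   max_gap, max_aspect)):
            out.append(union(boxes[i], boxes[i + 1]))
            i += 2
        else:
            out.append(boxes[i])
            i += 1
    return out
```

Two tall, narrow numerals placed side by side can satisfy the same geometric test as a hen/tsukuri pair, which is why the chome/address part is excluded before this step runs.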

In step S405, a lattice structure is generated by combining the pixel blocks obtained in the preceding steps. For example, the larger of 1.2 times the estimated character size and 1.2 times the row height is taken as a threshold (the character size determination threshold), and a combination of pixel blocks is added to the lattice structure when its combined width is within that threshold.
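Lattice generation as described for step S405 can be sketched as enumerating runs of consecutive blocks whose combined width stays within the character size determination threshold. The sketch assumes blocks are given as (left, right) extents sorted in reading order, and it always keeps single blocks as candidates so that the lattice stays connected; that last point is an assumption, and `build_lattice` is a hypothetical name.

```python
from typing import List, Tuple


def build_lattice(spans: List[Tuple[int, int]], est_char_size: float,
                  row_height: float) -> List[Tuple[int, int]]:
    """Return edges (i, j): blocks i .. j-1 together form one character candidate."""
    # Threshold from the text: the larger of 1.2 x estimated character size
    # and 1.2 x row height.
    limit = max(1.2 * est_char_size, 1.2 * row_height)
    edges: List[Tuple[int, int]] = []
    for i in range(len(spans)):
        edges.append((i, i + 1))             # a single block is always a candidate
        for j in range(i + 2, len(spans) + 1):
            width = spans[j - 1][1] - spans[i][0]
            if width > limit:
                break                        # wider combinations only get wider
            edges.append((i, j))
    return edges
```

Each edge is later passed to character recognition, so every block merged beforehand in step S404 removes edges here and shrinks the search.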

In step S406, character recognition processing is performed on each individual character pattern candidate in the lattice structure created in step S405. The character recognition processing acquires the character candidates corresponding to the combined pixel blocks, together with the evaluation value corresponding to each character candidate. A plurality of character candidates can be obtained as the result of character recognition. The character candidates are obtained as, for example, a first candidate (evaluation value x1), a second candidate (evaluation value x2), ..., an n-th candidate (evaluation value xn), and the combination of characters with the largest evaluation value is determined as the optimal path through the lattice structure. The Viterbi algorithm is well suited to determining the optimal path. The present invention is not, however, limited to this algorithm; any other algorithm that provides the same function can be applied in the same way. The evaluation values of the character candidates that make up the lattice structure may also be stored in the external storage device 104.
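The optimal path search of step S406 can be illustrated with a Viterbi-style dynamic program over the lattice edges. This sketch assumes that only the top candidate and its evaluation value are kept per edge and that the path score is the sum of the evaluation values along the path; the data layout and the name `best_path` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def best_path(n_blocks: int,
              edges: Dict[Tuple[int, int], Tuple[str, float]]) -> Tuple[float, str]:
    """edges maps (i, j) with i < j to (character, evaluation value).

    Returns the maximum total evaluation value and the corresponding text
    for a path covering blocks 0 .. n_blocks-1 exactly once.
    """
    neg_inf = float("-inf")
    best: List[float] = [neg_inf] * (n_blocks + 1)
    back: List[Optional[Tuple[int, str]]] = [None] * (n_blocks + 1)
    best[0] = 0.0
    # Nodes are block boundaries; every edge goes forward, so one
    # left-to-right pass suffices.
    for j in range(1, n_blocks + 1):
        for (i, k), (char, score) in edges.items():
            if k == j and best[i] != neg_inf and best[i] + score > best[j]:
                best[j] = best[i] + score
                back[j] = (i, char)
    if best[n_blocks] == neg_inf:
        raise ValueError("no path covers all blocks")
    chars: List[str] = []
    j = n_blocks
    while j > 0:                      # follow back-pointers to recover the text
        i, char = back[j]
        chars.append(char)
        j = i
    return best[n_blocks], "".join(reversed(chars))
```

With fewer edges in the lattice, both the number of recognition calls and the number of paths compared in this pass go down.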

FIG. 9 is a diagram illustrating the results of the character recognition processing obtained in the steps described so far.

Reference numeral 901 denotes the state of the pixel blocks after pixel block extraction. "都" is split into its left-hand and right-hand components. In this example the pixel blocks are extracted by contour tracing rather than by projection, so a character such as "谷" is also split into separate pixel blocks.

Reference numeral 902 denotes the state of the pixel blocks after detection of the chome/address part. "0-00-00" is taken as the chome/address part.

Reference numeral 903 denotes the state of the pixel blocks after the radical combination processing. "都" is now combined. In this example the bounding rectangles of the pixel blocks that make up "谷" overlap, so they are excluded from radical combination.

Reference numeral 904 denotes the state of the character string after the lattice structure has been generated and the optimal path determined. Here, the character "谷" has been detected correctly.

As described above, according to the present embodiment, character recognition with excellent recognition speed and recognition accuracy becomes possible when characters are automatically read from an image.

Combining the radicals in the place name part, while preventing the numerals in the chome/address part from being combined, reduces the number of combination patterns (paths) to be evaluated for character candidates when the optimal path is determined, which speeds up character recognition processing. Reducing the number of paths is also expected to lower the misrecognition rate.
(Other embodiments)
Needless to say, the object of the present invention is also achieved by supplying a system or device with a computer-readable storage medium on which a software program implementing the functions of the above embodiment is recorded. It is likewise achieved when a computer (or a CPU or MPU) of the system or device reads and executes the program stored in the storage medium.

In this case, the program read from the storage medium itself implements the functions of the above embodiment, and the storage medium storing the program constitutes the present invention.

As the storage medium for supplying the program, for example, a flexible disk, a hard disk, an optical disk, a magneto-optical disk, a CD-ROM, a CD-R, a nonvolatile memory card, or a ROM can be used.

The functions of the above embodiment are implemented when the computer executes the program it has read. Needless to say, the invention also covers the case in which an OS (operating system) or the like running on the computer performs part or all of the actual processing based on the instructions of the program, and the above embodiment is implemented by that processing.

FIG. 1 is a block diagram showing a schematic configuration of the character recognition device according to an embodiment of the present invention.
FIG. 2 is a system configuration diagram showing a schematic configuration of the character recognition device according to an embodiment of the present invention.
FIG. 3 is a diagram showing an example of a handwritten address.
FIG. 4 is a flowchart explaining the flow of the character recognition processing according to an embodiment of the present invention.
FIG. 5 is a flowchart explaining the flow of the chome/address part detection processing according to an embodiment of the present invention.
FIG. 6 is a flowchart explaining the flow of the delimiter detection processing according to an embodiment of the present invention.
FIG. 7 is a flowchart explaining the flow of the chome/address part selection processing according to an embodiment of the present invention.
FIG. 8 is a diagram showing examples in which the chome/address part has been selected.
FIG. 9 is a diagram illustrating the results of the character recognition processing.

Explanation of symbols

101 CPU
102 ROM
103 RAM
104 External storage device
105 Display
106 Input device

Claims

A character recognition device comprising:
pixel block extraction means for extracting, from input image data, a plurality of pixel blocks constituting a character image;
detection means for detecting, from the plurality of pixel blocks, a portion in which specific characters are consecutive;
combining means for combining adjacent pixel blocks, among the pixel blocks excluding the portion of consecutive specific characters detected by the detection means, when the distance between the adjacent pixel blocks is within a predetermined distance determination threshold and the aspect ratio of the adjacent pixel blocks when combined is within a predetermined aspect ratio determination threshold;
character candidate acquisition means for acquiring, by character recognition processing, character candidates corresponding to the combined pixel blocks and evaluation values corresponding to the character candidates; and
determination means for determining the character candidates that maximize the evaluation value over the entire character image,
wherein the specific characters detected by the detection means include numerals and delimiters that separate one numeral from another.

The character recognition device according to claim 1, wherein the detection means detects candidates for the delimiter based on the aspect ratio of a pixel block and information on the position at which the pixel block is located.

The character recognition device according to claim 1, wherein the detection means determines, as the delimiter, a delimiter candidate that has a numeral on both sides.

A character recognition method performed in a character recognition device, the method comprising:
a pixel block extraction step in which pixel block extraction means of the character recognition device extracts, from input image data, a plurality of pixel blocks constituting a character image;
a detection step in which detection means of the character recognition device detects, from the plurality of pixel blocks, a portion in which specific characters are consecutive;
a combining step in which combining means of the character recognition device combines adjacent pixel blocks, among the pixel blocks excluding the portion of consecutive specific characters detected in the detection step, when the distance between the adjacent pixel blocks is within a predetermined distance determination threshold and the aspect ratio of the adjacent pixel blocks when combined is within a predetermined aspect ratio determination threshold;
a character candidate acquisition step in which character candidate acquisition means of the character recognition device acquires, by character recognition processing, character candidates corresponding to the combined pixel blocks and evaluation values corresponding to the character candidates; and
a determination step in which determination means of the character recognition device determines the character candidates that maximize the evaluation value over the entire character image,
wherein the specific characters detected in the detection step include numerals and delimiters that separate one numeral from another.

A computer program for causing a computer to function as each means of the character recognition device according to any one of claims 1 to 3.

A computer-readable storage medium storing the computer program according to claim 5.