JP2013101616A

JP2013101616A - Method and system for dividing characters of text row having various character widths

Info

Publication number: JP2013101616A
Application number: JP2012245617A
Authority: JP
Inventors: Luo Zhaohai; ルオジャオハイ; Xian Lee; リーシーアン
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2011-11-09
Filing date: 2012-11-07
Publication date: 2013-05-23
Anticipated expiration: 2032-11-07
Also published as: CN103106406A; JP5600723B2; CN103106406B

Abstract

PROBLEM TO BE SOLVED: To provide a method for dividing characters of a text row having various character widths. SOLUTION: The method includes: a first division step of dividing the text row into a first character set on the basis of a projection method; a calculation step of calculating an average character width on the basis of the first character set; a forcible division step of acquiring a second character set by forcibly dividing wide characters of the first character set on the basis of the calculated average character width; a setting step of setting various average character widths to various characters of the second character set; and a connection step of preparing various division patterns according to the various average character widths set in the prior step and then of selecting the best division pattern to connect the character group of the second character set.

Description

Field of the Invention
The present invention relates to optical character recognition, and more particularly to a method and system for dividing the characters of a text line having various character widths.

Description of the Related Art
In an optical character recognition (OCR) system, OCR processing is generally performed as shown in the flowchart of FIG. 3. First, a document image acquired by a scanner, a camera, or other means is input. Next, the document image, which contains a plurality of text lines, is divided into text line images. For each text line image, character division is performed on the characters in that text line. Character recognition is then performed on the basis of the character division result to generate a character recognition result.

In the character division step, the text line image is generally first divided into characters on the basis of black pixel projection. An average character width (ACW) is calculated from statistical information such as character width, character height, and line height. Optionally, character division by a connected component method may be performed at this point. When the width of a divided character is larger than the average character width, forced division is performed, either according to the average character width or by a boundary tracking method such as the one disclosed in Japanese Patent Laid-Open No. 5-128307. Next, the character fragments are combined by creating various division patterns (paths) according to the average character width. The characters in every pattern are then recognized, and the best division result among the different division patterns is selected as the character division result.

In this flow, the average character width is a very important criterion for determining whether a character region contains several actual characters, or is part of a character or part of a group of characters. The average character width is likewise an important criterion when dividing characters and when creating reasonable division patterns.

However, the inventors of the present invention have found that, when a text line in a document is made up of characters whose widths differ from one another, a single fixed average character width sometimes cannot fit all of the characters in the text line. If the average character width suits only "wide" characters, some connected characters may not be divided correctly, or some characters may be wrongly combined into one character. A "wide" average character width also yields many possible division patterns, which entails more computation time or complexity. If the average character width suits only "narrow" characters, some "wide" characters will be wrongly divided into fragments. Both cases reduce OCR accuracy.

Some examples illustrating the deficiencies of the prior art are shown in FIGS. 4A and 4B.

In FIG. 4A, the text line contains both full-width characters and half-width characters (for example, letters, digits, or the left/right components of full-width characters). FIG. 4A shows the character division result obtained when a single fixed average character width is used to divide the characters. In this prior art example, several character division errors occur; for example, the kanji 「特開」 are wrongly divided into fragments.

FIG. 4B shows a character division result obtained with the prior art. Taking Japanese as an example, even when the same typeface and the same font size are used, the widths of some kana differ from those of other kana and/or kanji. For example, the kana 「れる」 (second line in FIG. 4B), which have different widths, are wrongly divided.

In addition, because a text line may contain connected characters, it is difficult to find the best division result on the basis of the average character width alone. For example, the kana 「バイ」 (first line) and 「た」 (third line) in FIG. 4B are examples of connected characters, and they are wrongly divided by the prior art.

There is therefore a need for a technique that can divide the characters of a text line containing both full-width and half-width characters, or containing kana and kanji of various character widths, so that OCR accuracy is improved. There is also a need for a technique that can divide connected characters.

In view of the technical problems of the prior art described above, a novel method and system for dividing the characters of a text line having various character widths are provided.

Summary of the Invention
According to one aspect of the present invention, there is provided a method for dividing the characters of a text line having various character widths. The method comprises:
a first division step of dividing the text line into a first set of characters on the basis of a projection method;
a calculation step of calculating an average character width on the basis of the first set of characters;
a forced division step of forcibly dividing wide characters of the first set of characters on the basis of the calculated average character width to obtain a second set of characters;
a setting step of setting various average character widths for various characters of the second set of characters; and
a combining step of combining characters of the second set of characters by creating various division patterns according to the various average character widths that have been set and selecting the best division pattern.

According to another aspect of the present invention, there is provided a system for dividing the characters of a text line having various character widths. The system comprises:
a first division unit configured to divide the text line into a first set of characters on the basis of a projection method;
a calculation unit configured to calculate an average character width on the basis of the first set of characters;
a forced division unit configured to forcibly divide wide characters of the first set of characters on the basis of the calculated average character width to obtain a second set of characters;
a setting unit configured to set various average character widths for various characters of the second set of characters; and
a combining unit configured to combine characters of the second set of characters by creating various division patterns according to the various average character widths that have been set and selecting the best division pattern.

In contrast to the incorrect prior art division results of FIGS. 4A and 4B, FIGS. 20A and 20B show division results obtained by applying the method according to the present invention. The division result of FIG. 20A clearly shows that the characters of a text line containing full-width characters (kanji) and half-width characters (letters and digits) are divided correctly. The division result of FIG. 20B clearly shows that the characters of text lines containing connected characters, such as the kana 「バイ」 (first line) and 「た。」 (third line), are also divided correctly.

With correct character division results, the accuracy of optical character recognition is greatly improved for text lines that contain full-width and half-width characters, that contain kana and kanji of various character widths, or that contain connected characters.

Further characteristic features and advantages of the present invention will become apparent from the following description and drawings.

FIG. 1 is a block diagram showing the configuration of a computer device for dividing the characters of a text line having various character widths according to the present invention.
FIG. 2 is a functional block diagram showing the general configuration of a system for dividing the characters of a text line having various character widths according to an embodiment of the present invention.
FIG. 3 is a flowchart showing the application of the present invention in optical character recognition.
FIG. 4A is a diagram showing an example of a prior art character division result for the characters of a text line having various character widths.
FIG. 4B is a diagram showing an example of a prior art character division result for the characters of a text line having various character widths.
FIG. 5 is a flowchart showing a method for dividing the characters of a text line having various character widths according to an embodiment of the present invention.
FIG. 6 is a table showing the division result of each step of the character division method according to an embodiment of the present invention.
FIG. 7 is a diagram showing an example of a text line that requires forced division processing.
FIG. 8 is a diagram showing an example of a division group used in forced division processing.
FIG. 9 is a diagram showing an example of forced division that requires a new division point to be added.
FIG. 10 is a flowchart showing the processing of step S200 of the method of FIG. 5 according to an embodiment of the present invention.
FIG. 11 is a flowchart showing the processing of step S300 of the method of FIG. 5 according to an embodiment of the present invention.
FIG. 12 is a flowchart showing the processing of step S400 of the method of FIG. 5 according to an embodiment of the present invention.
FIG. 13 is a flowchart of one method of detecting division points within a division group on the basis of the average character width.
FIG. 14 is a diagram showing how search positions are set in a division group.
FIG. 15 is a table showing the search ranges used to search for dynamically determined points.
FIG. 16 is a diagram showing the forced division result after step S400.
FIG. 17 is a flowchart showing the processing of step S500 of the method of FIG. 5 according to one embodiment of the present invention.
FIG. 18 is a flowchart showing the processing of step S500 of the method of FIG. 5 according to another embodiment of the present invention.
FIG. 19 is a diagram showing details of the processing of step S530 of FIG. 17.
FIG. 20A is a diagram showing an example of a character division result for the characters of a text line having various character widths after the method according to the present invention is applied.
FIG. 20B is a diagram showing an example of a character division result for the characters of a text line having various character widths after the method according to the present invention is applied.

Embodiments of the present invention will now be described in detail with reference to the drawings.

In this description, the terms "left" and "right" refer to the left and right sides of an image when it is viewed in the way a person would normally view it while reading this specification.

In this description, the term "character" refers to an individual element of a division result, which may be an actual character, part of an actual character, a punctuation mark, or a combination of these.

In this description, unless otherwise indicated, all sizes (for example, heights and widths) are expressed in pixels. For example, L < 5 means that L is less than 5 pixels.

FIG. 1 is a block diagram showing the configuration of a computer device for implementing a system for dividing the characters of a text line having various character widths according to the present invention. For simplicity, the system is shown as being built on a single computer device. However, the system is effective regardless of whether it is built on a single computer device or on a plurality of computer devices forming a network system.

As shown in FIG. 1, a computer device 100 is used to carry out the processing of dividing the characters of a text line having various character widths. The computer device 100 includes a CPU 101, a chipset 102, a RAM 103, a storage controller 104, a display controller 105, a hard disk drive 106, a CD-ROM drive 107, and a display 108. The computer device 100 further includes a signal line 111 connecting the CPU 101 and the chipset 102, a signal line 112 connecting the chipset 102 and the RAM 103, a peripheral device bus 113 connecting the chipset 102 and various peripheral devices, a signal line 114 connecting the storage controller 104 and the hard disk drive 106, a signal line 115 connecting the storage controller 104 and the CD-ROM drive 107, and a signal line 116 connecting the display controller 105 and the display 108.

A client 120 is connected to the computer device 100 either directly or via a network 130. The client 120 sends a character division task to the computer device 100, and the computer device 100 returns the division result to the client 120.

FIG. 2 is a block diagram showing the general configuration, in terms of module units, of a system for dividing the characters of a text line having various character widths.

As shown in FIG. 2, a character division system 200 includes: a first division unit 201 configured to divide a text line into a first set of characters on the basis of a projection method; a calculation unit 203 configured to calculate an average character width on the basis of the first set of characters; optionally, a second division unit 205 configured to divide wide characters of the first set of characters using a connected component method to obtain a third set of characters; a forced division unit 207 configured to forcibly divide wide characters of the third set of characters on the basis of the calculated average character width to obtain a second set of characters; a setting unit 209 configured to set various average character widths for various characters of the second set of characters; and a combining unit 211 configured to combine characters of the second set of characters by creating various division patterns according to the various average character widths that have been set and selecting the best division pattern.

In the character division system 200, the second division unit 205 serves to further improve the accuracy of optical character recognition and may be omitted in one embodiment. For this reason, the second division unit 205 is drawn with a broken line. When the second division unit 205 is omitted, the forced division unit 207 is configured to forcibly divide, directly and on the basis of the calculated average character width, the wide characters of the first set of characters obtained by the first division unit 201, thereby obtaining the second set of characters.

The units described above are exemplary, preferred modules for carrying out the processing described below, and can be implemented in hardware or software. The modules for carrying out the individual steps are not described exhaustively above. However, wherever there is a step that performs a certain process, there is a corresponding functional module or unit that carries out the same process.

FIG. 5 is a flowchart showing a method for dividing the characters of a text line having various character widths according to an embodiment of the present invention. The method includes: a first division step (S100) of dividing the text line into a first set of characters on the basis of a projection method; a calculation step (S200) of calculating an average character width on the basis of the first set of characters; an optional second division step (S300) of dividing wide characters of the first set of characters using a connected component method to obtain a third set of characters; a forced division step (S400) of forcibly dividing wide characters of the third set of characters on the basis of the calculated average character width to obtain a second set of characters; a setting step (S500) of setting various average character widths for various characters of the second set of characters; and a combining step (S600) of combining characters of the second set of characters by creating various division patterns according to the various average character widths that have been set and selecting the best division pattern.

In FIG. 5, the second division step S300 serves to further improve the accuracy of optical character recognition and may be omitted in one embodiment. For this reason, step S300 is drawn with a broken line. When the second division step S300 is omitted, the forced division step S400 forcibly divides, directly and on the basis of the calculated average character width, the wide characters of the first set of characters obtained in the first division step S100, thereby obtaining the second set of characters.

In step S100, the characters of the text line are divided from the original text line image using a projection method to obtain the first set of characters. Projection methods include black pixel projection, white pixel projection, and the like; these are well-known character division methods in the field of optical character recognition and are not described in detail here. The number of characters in the first set of characters is counted as V1. FIG. 6 is a table showing the division result of each step of the character division method according to an embodiment of the present invention. When the text line of FIG. 6 is divided by the projection method, the first row of the table is obtained. In this case, V1 = 14.
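The text does not detail the projection method, so the following is only a minimal sketch of black pixel projection under stated assumptions: the text line image is already binarized (1 = black, 0 = white), and every run of columns containing at least one black pixel becomes one element of the first set of characters. The function name `split_by_projection` and the toy image are hypothetical and serve illustration only.

```python
from typing import List, Tuple


def split_by_projection(image: List[List[int]]) -> List[Tuple[int, int]]:
    """Return (start, end) column spans, end exclusive, of the runs of
    columns that contain at least one black pixel.

    `image` is assumed to be a binarized text line: rows of 0 (white)
    and 1 (black) values, all rows having the same length.
    """
    if not image:
        return []
    width = len(image[0])
    # Vertical projection: number of black pixels in each column.
    profile = [sum(row[x] for row in image) for x in range(width)]

    spans: List[Tuple[int, int]] = []
    start = None
    for x, count in enumerate(profile):
        if count > 0 and start is None:
            start = x                    # a character region begins
        elif count == 0 and start is not None:
            spans.append((start, x))     # a blank column closes the region
            start = None
    if start is not None:                # region touching the right edge
        spans.append((start, width))
    return spans


# Toy 2-row image with three ink runs separated by blank columns.
line = [
    [1, 1, 0, 0, 1, 0, 1, 1, 1],
    [1, 0, 0, 0, 1, 0, 0, 1, 1],
]
first_set = split_by_projection(line)
v1 = len(first_set)  # plays the role of V1 in the text
```

Because only completely blank columns separate regions, touching characters stay merged and characters with separate left and right components are cut apart, which is why the later steps are needed.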

In step S100, connected characters and characters that have a left component and a right component may be divided incorrectly. For example, the kanji 「能」 in the first row of the table is wrongly divided.

In step S200, the average character width for the entire text line is calculated on the basis of the first set of characters. Step S200 is described in detail below.

In one embodiment, the character division method according to the present invention may include step S300 in order to further improve the accuracy of optical character recognition. In step S300, wide characters of the first set of characters are divided using a connected component method to obtain a third set of characters. The connected component method is also a well-known character division method in the field of optical character recognition and is not described in detail here. A "wide" character is a character whose width is larger than a threshold TH0. TH0 is larger than 0.9 × ACW; for example, TH0 = 1.1 × ACW. The number of characters in the third set of characters is counted as V2. When the text line of FIG. 6 is divided by the connected component method, the second row of the table is obtained. In this case, V2 = 16. Even after division by the connected component method, some connected characters remain undivided, for example the kana 「バイ」 in FIG. 4B. Step S300 is described in detail below.
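The "wide" character test can be illustrated with a short sketch. It assumes character widths in pixels and uses the example value TH0 = 1.1 × ACW from the text; the helper names `wide_threshold` and `find_wide_characters` and the sample widths are hypothetical.

```python
from typing import List


def wide_threshold(acw: float, factor: float = 1.1) -> float:
    """Return TH0 = factor * ACW.

    The text only requires TH0 > 0.9 * ACW and gives 1.1 * ACW as an
    example, so factors of 0.9 or less are rejected here.
    """
    if factor <= 0.9:
        raise ValueError("TH0 must be larger than 0.9 * ACW")
    return factor * acw


def find_wide_characters(widths: List[float], acw: float,
                         factor: float = 1.1) -> List[int]:
    """Return the indices of characters whose width exceeds TH0."""
    th0 = wide_threshold(acw, factor)
    return [i for i, w in enumerate(widths) if w > th0]


# Hypothetical character widths in pixels, with ACW = 20.
widths = [20, 45, 21, 23, 19]
wide = find_wide_characters(widths, acw=20)
```

Only the characters flagged here would be passed to the connected component method in step S300, and any that are still wider than TH0 afterwards would go on to forced division in step S400.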

In step S400, wide characters of the third set of characters (or of the first set of characters, if step S300 is omitted) are forcibly divided on the basis of the calculated average character width to obtain the second set of characters. Even after the optional step S300, wide characters whose width exceeds the threshold TH0 may still remain in the text line, so forced division is necessary. The number of characters in the second set of characters is counted as V3. When the text line of FIG. 6 is divided by forced division, the third row of the table is obtained. In this case, V3 = 27. Step S400 is described in detail below.

In step S500, various average character widths are set for various characters of the second set of characters. One (larger) ACW is set for wide characters (or full-width characters), and another ACW is set for normal characters (or half-width characters). Step S500 is described in detail below.

In step S600, characters of the second set of characters are combined by creating various division patterns according to the various average character widths that have been set and selecting the best division pattern.

In short, the characters of the second set obtained by forced division may be fragments of actual characters (for example, the kanji 「能」 is divided into two components). There are many possible ways of combining these fragments to recover the actual characters, and each of them is called a division pattern. A division pattern represents a combination of adjacent characters of the second set of characters. The average character width is a very important condition when creating reasonable patterns, because it restricts which patterns are created; a single average character width cannot fit both the wide and the narrow characters of the same text line. An appropriate average character width excludes unreasonable patterns while retaining the correct one, which reduces the amount of computation and improves OCR accuracy. The present invention focuses mainly on when and how to calculate various average character widths for a single text line. Methods of using a calculated average character width to restrict the patterns are well known in the field of optical character recognition; one detailed description can be found in Richard G. Casey and Eric Lecolinet, "A Survey of Methods and Strategies in Character Segmentation", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 18, No. 7, July 1996.
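How a width limit restricts the set of division patterns can be seen in a small sketch. The patent leaves pattern creation to known techniques, so the rule used here is an assumption made for illustration only: adjacent fragments may be merged as long as their summed width does not exceed a limit derived from the average character width, and gaps between fragments are ignored. The function name `division_patterns` is hypothetical.

```python
from typing import List


def division_patterns(widths: List[int],
                      max_width: float) -> List[List[List[int]]]:
    """Enumerate every way of grouping adjacent fragments such that the
    summed width of each merged group is at most `max_width`.

    A single fragment always forms a valid group, even if it is wider
    than `max_width` (assumed rule; not taken from the patent).
    """
    if not widths:
        return [[]]  # exactly one pattern: the empty one
    patterns: List[List[List[int]]] = []
    total = 0
    for end in range(1, len(widths) + 1):
        total += widths[end - 1]
        if end > 1 and total > max_width:
            break  # merging more fragments would exceed the limit
        for rest in division_patterns(widths[end:], max_width):
            patterns.append([widths[:end]] + rest)
    return patterns


# Two half-width fragments followed by one full-width fragment.
fragments = [10, 10, 20]
narrow_limit = division_patterns(fragments, max_width=22)
wide_limit = division_patterns(fragments, max_width=44)
```

With the limit of 22 only two patterns survive, whereas the limit of 44 admits four, including the implausible merge of all three fragments. This matches the observation in the text that a "wide" average character width produces many more candidate patterns to recognize.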

FIG. 10 is a flowchart showing the processing of step S200 of the method of FIG. 5 according to an embodiment of the present invention.

In step S210, a rough average character width ACW1 of the first set of characters is calculated. That is, this calculation takes into account all of the characters of the first set of characters (of the text line). Next, to determine whether the calculated average character width ACW1 is appropriate, the confidence of ACW1 is calculated as follows.

Three kinds of characters are counted. All characters of the first set of characters are counted as the value C1. Characters whose width-to-height ratio is reasonable are counted as the value C2. Characters whose width-to-height ratio is reasonable and whose width is close to ACW1 are counted as the value C3. A character's width-to-height ratio is considered reasonable if it satisfies 1 - TH16 < width-to-height ratio < 1 + TH16, where TH16 is a threshold that ranges, for example, from 0.1 to 0.5, and preferably TH16 = 0.1. A character's width is considered close to ACW1 if it satisfies (1 - TH16) * ACW1 < character width < (1 + TH16) * ACW1. Once the values C1, C2, and C3 have been obtained, the confidence of ACW1 = Minimum(C2/C1, C3/C2), where Minimum(A, B) denotes the smaller of A and B.
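The confidence calculation can be written out directly from the definitions of C1, C2, and C3. The sketch below assumes each character is given as a (width, height) pair in pixels and uses the preferred value TH16 = 0.1. Returning 0.0 when C1 or C2 is zero is an assumption added to avoid division by zero; the text does not cover that case. The function name `acw_confidence` is hypothetical.

```python
from typing import List, Tuple


def acw_confidence(chars: List[Tuple[float, float]], acw: float,
                   th16: float = 0.1) -> float:
    """Confidence of an average character width: min(C2/C1, C3/C2).

    chars: (width, height) of every character in the first set.
    C1: all characters.
    C2: characters with 1 - TH16 < width/height < 1 + TH16.
    C3: characters counted in C2 whose width also satisfies
        (1 - TH16) * ACW < width < (1 + TH16) * ACW.
    """
    c1 = len(chars)
    plausible = [(w, h) for w, h in chars
                 if h > 0 and 1 - th16 < w / h < 1 + th16]
    c2 = len(plausible)
    c3 = sum(1 for w, _ in plausible
             if (1 - th16) * acw < w < (1 + th16) * acw)
    if c1 == 0 or c2 == 0:
        return 0.0  # assumed handling; not specified in the text
    return min(c2 / c1, c3 / c2)


# Three full-width characters (20 x 20) and one half-width (10 x 20).
chars = [(20, 20), (20, 20), (10, 20), (20, 20)]
acw1 = sum(w for w, _ in chars) / len(chars)   # 17.5
low = acw_confidence(chars, acw1)              # no width is close to 17.5
high = acw_confidence(chars, 20)               # widths cluster around 20
```

In this example the plain mean of 17.5 pixels matches no character and scores 0.0, while a width of 20 pixels scores 0.75, which is how the confidence measure exposes an average distorted by mixed character widths.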

If the calculated confidence of ACW1 is less than a threshold TH1 (TH1 is, for example, greater than 0.6, and preferably TH1 = 0.75), ACW1 is not sufficiently appropriate, and processing continues to step S220. Otherwise, ACW1 is used as the ACW of the entire text line.

In step S220, an average character width ACW2 is calculated over those characters, selected from the first set of characters, whose width-to-height ratio lies within a predetermined range. For example, the predetermined range of the width-to-height ratio is [1 - TH17, 1 + TH17] (where TH17 ranges from 0 to 0.4), and preferably [0.9, 1.1]. The confidence of ACW2 is calculated in the same way as in step S210, except that ACW1 is replaced by ACW2 in the calculation of C3. If the calculated confidence of ACW2 is less than the threshold TH1, ACW2 is not sufficiently appropriate, and processing continues to step S230. Otherwise, ACW2 is used as the ACW of the entire text line.
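As a rough illustration of step S220, the sketch below averages only the widths of characters whose width-to-height ratio falls in the preferred range [0.9, 1.1]. The function name, the `(width, height)` tuples, and the `None` return for an empty selection are assumptions made for the example.

```python
def average_width_in_ratio_range(chars, th17=0.1):
    """Average width of characters whose width/height lies in [1 - th17, 1 + th17].

    chars: list of (width, height) tuples (hypothetical representation).
    Returns None when no character qualifies (an assumption of this sketch).
    """
    selected = [w for w, h in chars if 1 - th17 <= w / h <= 1 + th17]
    if not selected:
        return None
    return sum(selected) / len(selected)


chars = [(100, 100), (98, 100), (102, 100), (50, 100)]
acw2 = average_width_in_ratio_range(chars)
print(acw2)
```

Here the narrow character (ratio 0.5) is excluded, so the average is taken over the three roughly square characters only.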

In step S230, an average character width ACW3 is calculated according to the average character width of the preceding or following text line. Specifically, it is determined whether a text line adjacent to the current text line (immediately before or after it) exists in the document image and whether the difference in height between the current text line and the adjacent text line is less than a threshold TH2. Here, TH2 = X * (the larger of cLineHeight and pLineHeight), where X ranges from 0.1 to 0.5, cLineHeight is the maximum character height of the current text line, and pLineHeight is the maximum character height of the adjacent text line. If the result of the determination is negative, processing continues to step S240. If the result is affirmative, the confidence of the average character width of the adjacent text line is calculated. If that confidence is less than the threshold TH1, processing continues to step S240; otherwise, the average character width ACW3 of the current text line is calculated from the average character width of the adjacent text line according to the following formula.

Here, coeff ranges from 0 to 1 and is preferably 0.7, and ACW_CurrentLine = ACW1 or ACW2.

In step S240, an average character width ACW4 is calculated by multiplying the height of the text line by a constant. The confidence of ACW4 is calculated in the same way as in step S210, except that ACW1 is replaced by ACW4 in the calculation of C3. If the calculated confidence is less than the threshold TH1, ACW1 is used as the ACW of the entire text line. Otherwise, ACW4 is used as the ACW of the entire text line.

FIG. 10 shows only a preferred method for calculating the average character width of the entire line according to the present invention. In a simplified embodiment, it is not necessary to calculate confidences and obtain the average character width through the cascade of FIG. 10; instead, the average character width may be calculated directly by any one of the following: calculating the average character width of the first set of characters; calculating the average character width of those characters, selected from the first set of characters, whose width-to-height ratio lies within a predetermined range; calculating the average character width from the average character width of the preceding or following text line; or calculating the average character width by multiplying the height of the text line by a constant.

FIG. 11 is a flowchart showing the processing of step S300 of the method of FIG. 5 according to an embodiment of the present invention. In the method of FIG. 5, step S300 is optional.

In step S310, each character in the first set of characters whose width is greater than a threshold TH0 (that is, each wide character) is divided using the connected component method. Here, TH0 = X * ACW, where X is, for example, greater than 0.9, and preferably X = 1.1. After step S310, a third set of characters is obtained, and the number of characters in the third set is counted as V2. When the text line of FIG. 6 is divided by the connected component method, the second row of the table is obtained.

In step S320, if step S310 has divided the text line into too many characters, that is, if V2 / V1 is greater than a threshold TH4 (TH4 is greater than 1.1, and preferably TH4 = 1.3), the average character width is recalculated using the method described in step S210.

FIG. 12 is a flowchart showing the processing of step S400 of the method of FIG. 5 according to an embodiment of the present invention.

Because step S300 is optional, the input to step S400 is either the first set of characters (the result of division by the projection method) when step S300 is omitted, or the third set of characters (the result of division by the connected component method) when step S300 is included. For simplicity, only the former case is described as an example. Those skilled in the art will, however, understand that the present application applies equally to the latter case.

In step S410, for each character in the first set of characters, it is determined whether the character has a width greater than a threshold TH5 (that is, whether it is too wide). Here, TH5 = X * ACW, where X is greater than 1, and preferably X = 1.1. If the result of the determination is affirmative, the character is subjected to forced division using steps S420 to S450. FIG. 7 shows an example of a text line that requires forced division processing. For example, suppose that the text line image of FIG. 7 is processed and that the average character width calculated in step S200 is 78. After step S100 (and even after step S300), the character circled in FIG. 7 has not been correctly divided, and its width is 104. Since 104 > 1.1 * 78 = 85.8, the marked character is a wide character and will be forcibly divided.
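The wide-character test of step S410 amounts to a single comparison. The sketch below uses the preferred X = 1.1 and the numbers from the FIG. 7 example (ACW = 78, character width = 104); the function name `is_wide` is hypothetical.

```python
def is_wide(char_width, acw, x=1.1):
    """Return True when the character width exceeds TH5 = x * acw."""
    return char_width > x * acw


acw = 78            # average character width from the FIG. 7 example
marked_width = 104  # width of the circled character in the FIG. 7 example
print(is_wide(marked_width, acw))  # TH5 is 1.1 * 78 = 85.8
```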

In step S420, for each wide character, a plurality of division groups are generated, each consisting of the wide character alone or of the wide character combined with adjacent characters; possible division points in each division group are searched for based on the average character width; and a score is obtained for each division point.

Taking the wide character of FIG. 7 as an example, four division groups are generated as shown in FIG. 8, which shows an example of the division groups used in forced division processing. From left to right in FIG. 8, the four division groups are: the current character alone; the current character combined with the preceding character; the current character combined with the following character; and the current character combined with both the preceding and the following characters. Next, based on the average character width, possible division points in each division group are searched for separately from the left end and from the right end of the group, and a score is obtained for each division point.

The processing of step S420 is now described in detail with reference to FIG. 13. FIG. 13 is a flowchart of one method for detecting division points within one division group based on the average character width.

In step S421, one or more search positions within the division group are set according to the average character width. The search positions are placed at distances N * ACW from both the left end and the right end of the division group, where N = 1, 2, ..., INT(width of the division group / ACW) and INT(X) is the integer part of X. Taking the four division groups of FIG. 8 as an example, all the search positions are shown in FIG. 14, which illustrates how search positions are set in a division group. In FIG. 14, the four rows correspond to the four division groups of FIG. 8; the left side shows the case in which possible division points are searched for from the left end of the division group, and the right side shows the case in which they are searched for from the right end.
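One way to picture step S421 is the sketch below, which lists the search positions measured from each end of a division group. The function name and the convention that positions are x-offsets from the group's left edge are assumptions; the group width of 170 is a made-up value.

```python
def search_positions(group_width, acw):
    """Search positions at N * acw from the left end and from the right end.

    Positions are returned as x-offsets from the left edge of the group
    (a convention assumed for this sketch).
    """
    n_max = int(group_width / acw)  # INT(width of the division group / ACW)
    from_left = [n * acw for n in range(1, n_max + 1)]
    from_right = [group_width - n * acw for n in range(1, n_max + 1)]
    return from_left, from_right


left, right = search_positions(170, 78)
print(left, right)
```

Because the group width is rarely an exact multiple of ACW, the two lists generally differ, which is why the description searches from both ends.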

In step S422, for each search position, a search range for the division point, centered on that search position, is determined dynamically according to the difference between a multiple of the average character width and the width of the division group in which the search position lies. Specifically, possible division points are searched for in the vicinity of each position. The search range of the division point is [-TH7, TH7], centered on the search position; for example, TH7 = 5% * ACW. Let Ratio = ("width of the division group" MOD "ACW") / "ACW", where MOD is the remainder operator. If Ratio is greater than 85% or less than 15%, TH7 is widened to 10% * ACW. For the case of FIG. 14, the search ranges centered on the search positions for the first three division groups are shown in the table of FIG. 15. In FIG. 15, Ratio for the third division group in the table is 91%, which is greater than 85%, so TH7 is dynamically widened to 10% * ACW; that is, the search range for this division group is [-10% * ACW, 10% * ACW] centered on each search position in the group. In this case, the correct division point could not be found if a fixed search range were used.
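The dynamic choice of TH7 can be sketched as below. The function name and parameter names are hypothetical, and the group widths in the example are made-up values, not taken from FIG. 15.

```python
def search_half_width(group_width, acw, base=0.05, widened=0.10):
    """Half-width TH7 of the search range around each search position.

    Ratio = (group_width MOD acw) / acw. When the group width is close to a
    whole multiple of acw (Ratio > 85% or Ratio < 15%), TH7 is widened.
    """
    ratio = (group_width % acw) / acw
    if ratio > 0.85 or ratio < 0.15:
        return widened * acw
    return base * acw


# 149 MOD 78 = 71, so Ratio is about 91% and the range is widened.
print(search_half_width(149, 78))
# 110 MOD 78 = 32, so Ratio is about 41% and the base range is kept.
print(search_half_width(110, 78))
```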

In step S423, within each search range, a division score is calculated for each pixel column (or row), and the pixel column (or row) having the minimum division score is selected as the division point within that search range. Evidently, the division point is a pixel column when the text line is horizontal and a pixel row when the text line is vertical. For example, the score is the sum of the black pixel projection of the pixel column (or row) and the number of its black pixels that are connected to other black pixels in the adjacent pixel columns (or rows).
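The example score can be read in more than one way; the sketch below takes one interpretation, assuming a horizontal text line, a binary image stored as a list of rows (1 = black), and 8-connectivity to the two neighbouring columns. All names here are hypothetical.

```python
def column_split_score(image, x):
    """Projection of column x plus the number of its black pixels that touch
    a black pixel in column x - 1 or x + 1 (8-connectivity assumed)."""
    height, width = len(image), len(image[0])
    projection = sum(image[y][x] for y in range(height))
    connected = 0
    for y in range(height):
        if not image[y][x]:
            continue
        neighbours = [
            image[ny][nx]
            for nx in (x - 1, x + 1) if 0 <= nx < width
            for ny in (y - 1, y, y + 1) if 0 <= ny < height
        ]
        if any(neighbours):
            connected += 1
    return projection + connected


def best_split_column(image, lo, hi):
    """Column with the minimum score in the inclusive search range [lo, hi]."""
    return min(range(lo, hi + 1), key=lambda x: column_split_score(image, x))


# Two blobs joined by a single pixel in the bottom row of column 2.
image = [
    [1, 1, 0, 1, 1],
    [1, 1, 0, 1, 1],
    [1, 1, 1, 1, 1],
]
print(best_split_column(image, 1, 3))
```

Under this reading, a thin bridge between two strokes scores low on both terms, so it is preferred as the division point.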

In step S424, for each search range around a search position, the pixel column (or row) having the minimum division score is selected as its division point.

In step S425, the division points of each division group and the scores of those division points are obtained.

Returning now to FIG. 12, in step S430 a score for each division group is calculated based on the scores of the division points of that group. Specifically, each division group has two scores: Score1, for division points searched for from the left end, and Score2, for division points searched for from the right end. Score1 is the average score of all division points of the division group found from the left end, and Score2 is the average score of all division points found from the right end. The final score of the division group is the minimum of Score1 and Score2.

In step S440, the division group having the minimum score among all the groups is selected as the forced division result. In one embodiment, the forced division processing may end after step S440.
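Steps S430 and S440 can be sketched as follows. The dictionary layout, the group names, and the per-point scores are invented for illustration only.

```python
from statistics import mean


def group_score(left_scores, right_scores):
    """Final score of a division group: min(Score1, Score2), where each is the
    mean of the division-point scores found from one end."""
    return min(mean(left_scores), mean(right_scores))


def select_group(groups):
    """Name of the division group with the minimum final score.

    groups: dict mapping a group name to (left_scores, right_scores).
    """
    return min(groups, key=lambda name: group_score(*groups[name]))


groups = {
    "current": ([6, 4], [8]),
    "current+previous": ([2, 3], [5, 5]),
    "current+next": ([7], [6, 6]),
}
print(select_group(groups))
```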

In another embodiment, a further determination may be made after step S440. In step S450, if the score of the selected division group is greater than a threshold, a new division point is added in the middle of the current character width based on the projection method. Specifically, if the score of the best division pattern is still greater than a threshold TH6, a new division point is added in the middle of the current character width according to the projection method. The new division point must satisfy the following conditions.

a) The black pixel projection at the division point is the minimum within range A, where range A is the middle portion of the character, from 1/4 to 3/4 of the character width.
b) The black pixel projection at the division point is less than 1/3 of the maximum black pixel projection within range A.
c) The pixel column (or row) corresponding to the division point contains only one black pixel block, where a black pixel block is a group of consecutive black pixels.

FIG. 9 shows an example of forced division in which a new division point needs to be added, illustrating one character division result. Vertical line 2 is the search position based on the average character width; the correct division point lies outside the search range for the division point. Vertical line 1 is the division position detected by executing steps S410 to S440. Vertical line 3 is the new division point added in step S450.
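Conditions a) to c) can be checked as in the sketch below, which assumes a horizontal text line, a binary character image stored as a list of rows (1 = black), and range A taken as the columns from 1/4 to 3/4 of the character width. The function names and the integer rounding of the range bounds are assumptions.

```python
def black_runs(column):
    """Number of blocks of consecutive black pixels in a column."""
    runs, previous = 0, 0
    for pixel in column:
        if pixel and not previous:
            runs += 1
        previous = pixel
    return runs


def find_extra_split(image):
    """Return a column satisfying conditions a) to c), or None."""
    height, width = len(image), len(image[0])
    projection = [sum(image[y][x] for y in range(height)) for x in range(width)]
    lo, hi = width // 4, (3 * width) // 4          # range A (rounding assumed)
    candidates = range(lo, hi + 1)
    x_min = min(candidates, key=lambda x: projection[x])   # condition a)
    peak = max(projection[x] for x in candidates)
    if projection[x_min] >= peak / 3:                      # condition b)
        return None
    column = [image[y][x_min] for y in range(height)]
    if black_runs(column) != 1:                            # condition c)
        return None
    return x_min


# Two touching characters joined by a single pixel in column 3.
image = [
    [1, 1, 1, 0, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1, 1, 1],
]
print(find_extra_split(image))
```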

FIG. 17 is a flowchart showing the processing of step S500 of the method of FIG. 5 according to an embodiment of the present invention. The inputs to the processing of FIG. 17 are the second set of characters obtained by the forced division of step S400 and the average character width obtained in step S200.

In step S510, the average space between the characters of the first set of characters (that is, the division result of the projection method in step S100) is calculated.

In step S520, it is determined, according to the number of characters in the first set of characters, the number of characters in the second set of characters, and the average space, whether the text line contains a large number of characters having various widths. Specifically, it is determined whether the following conditions are met.

Condition 1: Many characters have been divided by the forced division method (step S400); that is, for example, (V3 - V1) / V1 > TH18, where TH18 = 3/7.

Condition 2: The average space between the characters divided by the projection method of step S100 is sufficiently large; that is, the average space is greater than a threshold TH8 (TH8 = ACW / X, where X is greater than 8, and preferably X = 10).

When step S300 is included, the number of characters in the third set of characters can also be taken into account in the above determination. Specifically, in this case, condition 1 is, for example, (V3 - V1) / V1 > TH18 and (V2 - V1) / V1 > TH19, where TH19 = 3/20.
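The determination of step S520 can be sketched as below. The sketch assumes that both conditions must hold, that V1 and V3 are the character counts before and after forced division, and that V2 (when supplied) is the count after the optional connected component step; the function name and the example counts are made up.

```python
def has_many_mixed_width_chars(v1, v3, avg_space, acw,
                               v2=None, th18=3 / 7, th19=3 / 20, x=10):
    """Return True when conditions 1 and 2 of step S520 both hold."""
    cond1 = (v3 - v1) / v1 > th18
    if v2 is not None:
        # Variant of condition 1 used when step S300 is included.
        cond1 = cond1 and (v2 - v1) / v1 > th19
    cond2 = avg_space > acw / x  # TH8 = ACW / X
    return cond1 and cond2


print(has_many_mixed_width_chars(v1=20, v3=30, avg_space=10, acw=78))
```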

In step S530, if the determination result of step S520 is affirmative, a different average character width is set for the wide characters that were divided by forced division. Specifically, an affirmative result indicates that the text line contains many wide characters and that they have been forcibly divided. Characters divided by the forced division method (step S400) are regarded as wide characters that were divided by mistake. Because the wide characters have already been forcibly divided, setting a different average character width for a wide character means setting that different average character width on the first fragment of the wide character.

FIG. 19 shows the details of the processing of step S530 of FIG. 17. In step S531, an adjacent text line whose height is similar to that of the current text line is searched for as a similar line. The similar line must satisfy the following conditions.

i) The number of characters in the adjacent line is greater than a threshold TH13, for example TH13 > 10, and preferably TH13 = 20 (which means that its ACW is reliable).
ii) The difference in line height between the two lines is less than a threshold TH14, for example TH14 = X * (the larger of the current line height and the adjacent line height), where X < 0.5, and preferably X = 3/10.
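The two conditions for a similar line can be sketched as a single predicate. The function name and argument names are hypothetical; the defaults are the preferred values TH13 = 20 and X = 3/10.

```python
def is_similar_line(current_height, neighbour_height, neighbour_char_count,
                    th13=20, x=0.3):
    """Return True when an adjacent line qualifies as a "similar line"."""
    th14 = x * max(current_height, neighbour_height)
    enough_chars = neighbour_char_count > th13                      # condition i)
    similar_height = abs(current_height - neighbour_height) < th14  # condition ii)
    return enough_chars and similar_height


print(is_similar_line(100, 104, 25))
```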

If a similar line is found, processing continues to step S532. In step S532, the average character width of the similar line is used to set the different average character width. Specifically, the following formula is used to set the different ACW for wide characters.

Here, b > a, and preferably a = 1 and b = 4.

If no similar line is found, processing continues to step S533. In step S533, the average character width calculated in step S200 is multiplied directly by a parameter to set the different average character width. Specifically, the following formula is used to set the different ACW for wide characters.

Here, TH15 > 1.1, and preferably TH15 > 7/5.

The table of FIG. 6 shows an example of the different ACW for wide characters. In this case, the average space between characters is about 10 pixels. After step S500 (S530), the ACW of the wide characters becomes about 60 pixels, while the ACW of the other characters is unchanged. The values given in this table are non-limiting examples according to various implementations of the present invention.

FIG. 18 is a flowchart showing the processing of step S500 of the method of FIG. 5 according to another embodiment of the present invention. The inputs to the processing of FIG. 18 are the second set of characters obtained by the forced division of step S400 and the average character width obtained in step S200.

In step S540, a search is made for a target group, that is, a group of adjacent characters that are separated from one another only by forced division and for which the space between the last character of the group and the following character is greater than a threshold TH10. Here, TH10 = ACW / X, where X < 10, and preferably X = 7, and ACW is the value calculated in step S200. FIG. 16 shows the forced division result after step S400. In the division result of FIG. 16, two groups of adjacent characters are detected and marked "group 1" and "group 2". In this case, space 1 equals 21, space 2 equals 25, and ACW equals 63 (the width of a half-width character). With X = 7, TH10 = 63 / 7 = 9, so both group 1 and group 2 satisfy the condition of step S540. If a target group is detected, processing continues to step S550; otherwise, processing proceeds to step S600 without setting a different average character width.

In step S550, the width-to-height ratio WHR of the target character group is calculated. In the case shown in FIG. 16, the width, height and WHR of group 1 are 109, 105 and 1.04, respectively, and the width, height and WHR of group 2 are 95, 104 and 0.91, respectively.

In step S560, if the width-to-height ratio of the target group is less than a threshold, a different average character width is set for the first character of the target group. Specifically, if the WHR of the target character group is less than a threshold TH11 (for example, TH11 > 1, and preferably TH11 = 1.1), the ACW of the first character of the target character group is set to the value TH12 (for example, TH12 = X * the height of the target character group, where X > 1, and preferably X = 1.1). Note that WHR < 1.1 means that the target character group was originally a single full-width character that was divided by mistake by the forced division of step S400.
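Steps S540 to S560 can be traced with the numbers given for FIG. 16 (ACW = 63; group 1: width 109, height 105, following space 21; group 2: width 95, height 104, following space 25). The sketch below uses the preferred parameter values; the function name and the `None` return for "no different ACW is set" are assumptions.

```python
def different_acw_for_group(width, height, space_after, acw,
                            x_space=7, th11=1.1, x_height=1.1):
    """Return the different ACW (TH12) for the first character of a target
    group, or None when the group does not qualify."""
    if space_after <= acw / x_space:   # S540: space must exceed TH10 = ACW / X
        return None
    whr = width / height               # S550: width-to-height ratio
    if whr >= th11:                    # S560: WHR must be below TH11
        return None
    return x_height * height           # TH12 = X * group height


acw = 63
print(different_acw_for_group(109, 105, 21, acw))  # group 1
print(different_acw_for_group(95, 104, 25, acw))   # group 2
```

Both groups pass the space test (TH10 = 9) and have WHR below 1.1, so each receives a different ACW of 1.1 times its height.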

FIGS. 17 and 18 show two embodiments for implementing step S500 of FIG. 5. The processing of FIG. 17 is suited to long text lines, whereas the processing of FIG. 18 is suited to short text lines, such as the last paragraph of a document. The processing of FIG. 17 and that of FIG. 18 can each be used alone, as described above, or they can be used in combination; that is, the two processes can be executed in sequence to constitute step S500.

After step S500, two ACWs (one for normal characters and one for wide characters) have been set for the characters of various widths. In step S600, based on the character division result obtained in step S400 (the second set of characters) and the two different ACWs, the characters of the second set of characters are combined, according to the prior art, by creating various division patterns according to the various average character widths and selecting the best division pattern. Examples of correct character division results for the characters of text lines having various character widths, obtained after applying the method according to the present invention, are shown in FIGS. 20A and 20B.

In this description, all threshold values are merely examples and are not limiting.

In this description, Japanese is used as an example for describing the method and system according to the present invention for dividing characters of text having various character widths. The present invention is not, however, limited to Japanese, and it can be expected to apply to other languages as well, such as Chinese and Korean.

In this description, text line images are shown as horizontal rows, and such text lines are used as examples for explaining the present invention. The present invention can, however, also be expected to apply to text lines written as vertical columns. That is, the term "text line" in this description does not necessarily mean a text row.

The method and system of the present invention can be implemented in many ways, for example through software, hardware, firmware, or any combination thereof. The order of the steps of the method described above is intended to be illustrative only, and the steps of the method of the present invention are not limited to the specific order described above unless otherwise stated. Furthermore, in some embodiments, the present invention may be implemented as a program recorded on a recording medium, the program including machine-readable instructions for implementing the method according to the present invention. That is, the present invention also covers a recording medium that stores a program for implementing the method according to the present invention.

Although some specific embodiments of the present invention have been described in detail by way of example, those skilled in the art should understand that the above examples are intended to be illustrative only and do not limit the scope of the present invention. Those skilled in the art should also understand that the above embodiments can be modified without departing from the scope and spirit of the present invention. The scope of the present invention is defined by the appended claims.

Claims

A method for dividing characters in a text line having various character widths, the method comprising:
a first division step of dividing the text line into a first set of characters based on a projection method;
a calculation step of calculating an average character width based on the first set of characters;
a forced division step of obtaining a second set of characters by forcibly dividing wide characters of the first set of characters based on the calculated average character width;
a setting step of setting various average character widths for various characters of the second set of characters; and
a combining step of combining characters of the second set of characters by creating various division patterns according to the various average character widths thus set and selecting the best division pattern.

The method according to claim 1, wherein the forced division step includes:
For each wide character whose width is greater than a threshold, generating a plurality of division groups, each consisting of the wide character or a combination of the wide character and an adjacent character, searching for possible division points in each division group based on the average character width, and obtaining a score for each division point;
Calculating a score for each division group based on the score of each division point in the division group; and
Selecting the division group having the smallest score from all the division groups as the result of the forced division.
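
The selection among division groups can be sketched as follows. This is only an illustration: the `DivisionGroup` type is hypothetical, and taking the group score as the mean of its division-point scores is an assumption, since the claim states only that the group score is based on the point scores.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class DivisionGroup:
    span: Tuple[int, int]        # (start, end) pixel columns covered by the group
    point_scores: List[float]    # score of each division point found in the group


def group_score(group: DivisionGroup) -> float:
    """Assumed aggregation: mean of the division-point scores (lower is better)."""
    if not group.point_scores:
        return float("inf")      # assumption: a group without division points is never chosen
    return sum(group.point_scores) / len(group.point_scores)


def select_group(groups: List[DivisionGroup]) -> DivisionGroup:
    """Return the division group with the smallest score."""
    return min(groups, key=group_score)


# Hypothetical candidates for one wide character: the character alone,
# the character joined with its left neighbour, and with its right neighbour.
candidates = [
    DivisionGroup(span=(10, 40), point_scores=[3.0, 5.0]),
    DivisionGroup(span=(5, 40), point_scores=[1.0, 2.0]),
    DivisionGroup(span=(10, 48), point_scores=[4.0]),
]
chosen = select_group(candidates)
print(chosen.span, group_score(chosen))
```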

The method according to claim 2, wherein the forced division step further includes adding a new division point in the middle of the current character width, based on a projection method, when the score of the selected division group is larger than a threshold.

The method according to claim 3, wherein the searching for possible division points in each division group and the obtaining of a score for each division point include:
Setting one or more search positions within a division group according to the average character width;
For each search position, dynamically determining a search range for a division point, centered on the search position, according to the difference between the average character width and the width of the division group in which the search position is located;
In each search range, calculating a division score for each pixel column when the text line is horizontal, or for each pixel row when the text line is vertical;
For each search range, selecting the pixel column or the pixel row having the minimum division score as its division point; and
For each division group, obtaining its own division points and the scores of these division points.
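
The search for division points can be illustrated with the sketch below for a horizontal text line. Several details are assumptions introduced for the example and are not taken from the claim: search positions are spread evenly according to the estimated number of characters, the search radius is a fixed fraction (`factor`) of the difference between the group width and the average character width, and the division score of a pixel column is simply its ink count. For a vertical text line the same functions would be applied to per-row ink counts.

```python
from typing import List, Tuple


def search_positions(group_width: int, avg_width: float) -> List[int]:
    """Spread search positions evenly, one per expected character boundary."""
    n_chars = max(2, round(group_width / avg_width))   # estimated character count
    return [round(k * group_width / n_chars) for k in range(1, n_chars)]


def search_radius(group_width: int, avg_width: float,
                  factor: float = 0.25, min_radius: int = 1) -> int:
    """Assumed rule: the radius grows with |group width - average width|."""
    return max(min_radius, int(factor * abs(group_width - avg_width)))


def best_division_point(profile: List[int], center: int,
                        radius: int) -> Tuple[int, float]:
    """Return (column, score) of the lowest-score column in the search range.

    `profile` holds the ink count of every pixel column of the group; ties are
    resolved in favour of the column closest to the search position.
    """
    lo = max(0, center - radius)
    hi = min(len(profile) - 1, center + radius)
    best = min(range(lo, hi + 1), key=lambda x: (profile[x], abs(x - center)))
    return best, float(profile[best])


# Hypothetical division group, 30 columns wide, with an average width of 14.
profile = [5] * 30
profile[13] = 1
profile[17] = 0
positions = search_positions(30, 14.0)
radius = search_radius(30, 14.0)
points = [best_division_point(profile, p, radius) for p in positions]
print(positions, radius, points)
```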

The method according to claim 1, wherein the setting step includes:
Calculating an average space between characters of the first character set;
A determination step of determining, according to the number of characters of the first character set, the number of characters of the second character set, and the average space, whether the text line includes a large number of characters having various widths; and
Setting another average character width for the wide characters divided by the forced division when the result of the determination is affirmative.

The method according to claim 1, wherein the setting step includes:
Searching for a target group of adjacent characters that are divided only by the forced division and for which the space between the last character of the target group and the next character is larger than a threshold;
Calculating the width-to-height ratio of the target group if the target group is detected; and
Setting another average character width for the first character of the target group when the width-to-height ratio of the target group is less than a threshold.

The method according to claim 5, wherein the step of setting another average character width for the wide characters divided by the forced division includes:
Searching for an adjacent text line having a height similar to that of the current text line as a similar line;
If the similar line is detected, setting the other average character width using the average character width of the similar line; and
If the similar line is not detected, setting the other average character width by directly multiplying the average character width calculated in the calculation step by a parameter.
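
The choice of the other average character width can be sketched as follows. The `TextLine` type, the 10 % height tolerance used to decide that two lines are similar, and the default parameter value of 0.5 are assumptions made for illustration; the claim leaves the similarity criterion and the parameter unspecified.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TextLine:
    height: float
    avg_char_width: float


def find_similar_line(current: TextLine, neighbours: List[TextLine],
                      tolerance: float = 0.1) -> Optional[TextLine]:
    """Return the first adjacent line whose height is close to the current one."""
    for line in neighbours:
        if abs(line.height - current.height) <= tolerance * current.height:
            return line
    return None


def other_average_width(current: TextLine, neighbours: List[TextLine],
                        parameter: float = 0.5) -> float:
    """Use the similar line's average width, else scale the current average."""
    similar = find_similar_line(current, neighbours)
    if similar is not None:
        return similar.avg_char_width
    return current.avg_char_width * parameter


current = TextLine(height=20.0, avg_char_width=18.0)
with_similar = other_average_width(
    current, [TextLine(30.0, 25.0), TextLine(21.0, 10.0)])
without_similar = other_average_width(current, [TextLine(30.0, 25.0)])
print(with_similar, without_similar)
```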

The method according to claim 1, wherein the division pattern indicates a combination of adjacent character groups of the second set of characters.

The method according to claim 1, wherein, in the calculation step, the average character width is calculated by one of:
Calculating the average character width of the first character set;
Calculating the average character width of characters selected from the first character set whose width-to-height ratio is within a predetermined range;
Calculating the average character width according to the average character width of the previous or next text line; or
Multiplying the height of the text line by a constant value.
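
Three of the ways of calculating the average character width listed above can be sketched as separate helper functions. The function names, the ratio range of 0.8 to 1.2, and the constant 0.9 are hypothetical values chosen only to make the example concrete; the variant that reuses the average of the previous or next text line needs no computation of its own and is omitted.

```python
from typing import List, Optional


def mean_width(widths: List[float]) -> Optional[float]:
    """Plain average over all characters of the first character set."""
    return sum(widths) / len(widths) if widths else None


def mean_width_in_ratio_range(widths: List[float], line_height: float,
                              lo: float = 0.8, hi: float = 1.2) -> Optional[float]:
    """Average over characters whose width-to-height ratio lies in [lo, hi]."""
    selected = [w for w in widths if lo <= w / line_height <= hi]
    return mean_width(selected)


def width_from_height(line_height: float, constant: float = 0.9) -> float:
    """Estimate the average character width as a constant times the line height."""
    return line_height * constant


widths = [10.0, 20.0, 22.0, 40.0]   # hypothetical widths from the first division
print(mean_width(widths))
print(mean_width_in_ratio_range(widths, line_height=20.0))
print(width_from_height(20.0))
```

Restricting the average to characters with a near-square width-to-height ratio keeps narrow punctuation and merged characters from skewing the estimate, which is the apparent motivation for the second variant.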

The method according to any one of claims 1 to 9, further comprising a step of further dividing wide characters of the first character set using a connected component method to obtain a third character set,
Wherein the third character set is processed instead of the first character set in the forced division step.
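
A connected component method can separate characters whose ink overlaps in the projection profile but does not actually touch. The sketch below labels 8-connected components with a breadth-first search and returns the column span of each one; the choice of 8-connectivity and the name `component_spans` are assumptions for this example, not details taken from the claims.

```python
from collections import deque
from typing import List, Tuple


def component_spans(image: List[List[int]]) -> List[Tuple[int, int]]:
    """Return the (start, end) column span, end exclusive, of each 8-connected component."""
    if not image:
        return []
    height, width = len(image), len(image[0])
    seen = [[False] * width for _ in range(height)]
    spans = []
    for y in range(height):
        for x in range(width):
            if not image[y][x] or seen[y][x]:
                continue
            # Breadth-first search over the component that contains (y, x).
            seen[y][x] = True
            queue = deque([(y, x)])
            left = right = x
            while queue:
                cy, cx = queue.popleft()
                left, right = min(left, cx), max(right, cx)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and image[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            spans.append((left, right + 1))
    return sorted(spans)


# Hypothetical wide character: every column contains ink, so a projection
# profile shows a single run, yet the two strokes are not connected.
wide = [
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 1],
]
third_set = component_spans(wide)
print(third_set)
```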

A system for dividing characters of a text line having various character widths, the system comprising:
A first division unit configured to divide the text line into a first character set based on a projection method;
A calculation unit configured to calculate an average character width based on the first character set;
A forced division unit configured to forcibly divide wide characters of the first character set based on the calculated average character width to obtain a second character set;
A setting unit configured to set various average character widths for various characters of the second character set; and
A connection unit configured to combine character groups of the second character set by creating various division patterns according to the various average character widths thus set and selecting the best division pattern.

The system according to claim 11, wherein the forced division unit includes:
A unit configured to generate, for each wide character whose width is greater than a threshold, a plurality of division groups, each consisting of the wide character or a combination of the wide character and an adjacent character, to search for possible division points in each division group based on the average character width, and to obtain a score for each division point;
A unit configured to calculate a score for each division group based on the score of each division point in the division group; and
A unit configured to select the division group having the smallest score from all the division groups as the result of the forced division.

The system according to claim 12, wherein the forced division unit further includes:
A unit configured to add a new division point in the middle of the current character width, based on a projection method, if the score of the selected division group is greater than a threshold.

The system according to claim 13, wherein the unit configured to search for possible division points in each division group based on the average character width and to obtain a score for each division point includes:
A unit configured to set one or more search positions within a division group according to the average character width;
A unit configured to dynamically determine, for each search position, a search range for a division point, centered on the search position, according to the difference between the average character width and the width of the division group in which the search position is located;
A unit configured to calculate, in each search range, a division score for each pixel column when the text line is horizontal, or for each pixel row when the text line is vertical;
A unit configured to select, for each search range, the pixel column or the pixel row having the minimum division score as its division point; and
A unit configured to obtain, for each division group, its own division points and the scores of these division points.

The system according to claim 11, wherein the setting unit includes:
A unit configured to calculate an average space between characters of the first character set;
A unit configured to determine, according to the number of characters of the first character set, the number of characters of the second character set, and the average space, whether the text line includes a large number of characters having various widths; and
A unit configured to set another average character width for the wide characters divided by the forced division when the result of the determination is affirmative.

The system according to claim 11, wherein the setting unit includes:
A unit configured to search for a target group of adjacent characters that are divided only by the forced division and for which the space between the last character of the target group and the next character is larger than a threshold;
A unit configured to calculate the width-to-height ratio of the target group if the target group is detected; and
A unit configured to set another average character width for the first character of the target group when the width-to-height ratio of the target group is less than a threshold.

The system according to claim 15, wherein the unit configured to set another average character width for the wide characters divided by the forced division includes:
A unit configured to search for an adjacent text line having a height similar to that of the current text line as a similar line;
A unit configured to set the other average character width using the average character width of the similar line if the similar line is detected; and
A unit configured to set the other average character width by directly multiplying the average character width calculated by the calculation unit by a parameter if the similar line is not detected.

The system according to claim 11, wherein the division pattern indicates a combination of adjacent character groups of the second character set.

The system according to claim 11, wherein, in the calculation unit, the average character width is calculated by one of:
Calculating the average character width of the first character set;
Calculating the average character width of characters selected from the first character set whose width-to-height ratio is within a predetermined range;
Calculating the average character width according to the average character width of the previous or next text line; or
Multiplying the height of the text line by a constant value.

The system according to any one of claims 11 to 19, further comprising a second division unit configured to divide wide characters of the first character set using a connected component method to obtain a third character set,
Wherein the forced division unit processes the third character set in place of the first character set.