JP2014075071A - Information processor and information processing program

Info

Publication number: JP2014075071A
Application number: JP2012222861A
Authority: JP
Inventors: Kosuke Maruyama; 耕輔丸山; Shunichi Kimura; 俊一木村; Eiichi Tanaka; 瑛一田中
Original assignee: Fuji Xerox Co Ltd
Current assignee: Fujifilm Business Innovation Corp
Priority date: 2012-10-05
Filing date: 2012-10-05
Publication date: 2014-04-24
Anticipated expiration: 2032-10-05
Also published as: JP6229254B2

Abstract

PROBLEM TO BE SOLVED: To provide an information processor that reduces errors in determining whether a first segment and a second segment constitute the same character, compared with a case that lacks this configuration. SOLUTION: Based on a first coordinate information string and a second coordinate information string, each composed of coordinate information, first calculation means of the information processor calculates the distance of the part at which a first segment expressed by the first coordinate information string and a second segment expressed by the second coordinate information string overlap each other. Second calculation means calculates the length of the first segment or the second segment. Determination means determines whether the first segment and the second segment constitute the same character on the basis of the ratio of the distance calculated by the first calculation means to the length calculated by the second calculation means.

Description

The present invention relates to an information processing apparatus and an information processing program.

Patent Document 1 addresses the problem of improving the character recognition rate by calculating an aspect ratio that depends on the density of the input character string. It discloses the following: a sequence of N strokes input from a tablet serving as coordinate input means is input to basic segment dividing means and divided into basic segments; character string density calculating means calculates the character string density from the gaps between the basic segments produced by the basic segment dividing means and from the distance from the first character to the last character of the input character string; aspect ratio calculating means obtains, from the character string density, the optimum aspect ratio threshold for that density; candidate character generating means generates candidate characters; candidate character recognition means detects the name of the standard character with the smallest degree of difference, together with that degree of difference; and optimum character string means assigns to the input stroke sequence the series of character names that minimizes the sum of the degrees of difference.

Patent Document 1: JP 09-034992 A

An object of the present invention is to provide an information processing apparatus and an information processing program that reduce errors in determining whether a first line segment and a second line segment constitute the same character, compared with a case that lacks the present configuration.

The gist of the present invention for achieving this object lies in the inventions of the following claims.
The invention of claim 1 is an information processing apparatus comprising: first calculation means for calculating, based on a first coordinate information sequence and a second coordinate information sequence each composed of coordinate information, the distance of the portion where a first line segment represented by the first coordinate information sequence and a second line segment represented by the second coordinate information sequence overlap; second calculation means for calculating the length of the first line segment or the second line segment; and determination means for determining whether the first line segment and the second line segment constitute the same character, based on the ratio between the distance calculated by the first calculation means and the length calculated by the second calculation means.

The invention of claim 2 is the information processing apparatus according to claim 1, wherein the second calculation means includes third calculation means for calculating the length of the first line segment and fourth calculation means for calculating the length of the second line segment, and the determination means includes: second determination means for provisionally determining whether the first line segment and the second line segment constitute the same character, based on the ratio between the distance calculated by the first calculation means and the length calculated by the third calculation means; third determination means for provisionally determining whether the first line segment and the second line segment constitute the same character, based on the ratio between the distance calculated by the first calculation means and the length calculated by the fourth calculation means; and fourth determination means for determining whether the first line segment and the second line segment constitute the same character, based on the determination result of the second determination means and the determination result of the third determination means.

The invention of claim 3 is the information processing apparatus according to claim 1, wherein the second calculation means includes third calculation means for calculating the length of the first line segment, fourth calculation means for calculating the length of the second line segment, and selection means for selecting either the length calculated by the third calculation means or the length calculated by the fourth calculation means, and the determination means determines whether the first line segment and the second line segment constitute the same character, based on the ratio between the distance calculated by the first calculation means and the length selected by the selection means.

The invention of claim 4 is the information processing apparatus according to claim 1, wherein the first calculation means calculates a predetermined position of the second line segment or the first line segment, the second calculation means calculates a range of the first line segment or the second line segment, and the determination means determines whether the first line segment and the second line segment constitute the same character according to whether the position calculated by the first calculation means is included in the range calculated by the second calculation means.

The invention of claim 5 is the information processing apparatus according to claim 1, wherein the second calculation means calculates the lengths of the line segments that can be targets, the apparatus further comprises sorting means for sorting the line segments in ascending or descending order based on the lengths calculated by the second calculation means, and the determination means selects, according to the order produced by the sorting means, the shorter of the first line segment and the second line segment, and determines whether the first line segment and the second line segment constitute the same character, based on the ratio between the distance calculated by the first calculation means and the length of the selected line segment.

The invention of claim 6 is the information processing apparatus according to any one of claims 1 to 5, further comprising extraction means for extracting the group of line segments constituting the character, based on the first line segment and the second line segment determined by the determination means to constitute the same character.

The invention of claim 7 is the information processing apparatus according to claim 6, further comprising character recognition means for performing character recognition on the group of line segments extracted by the extraction means.

The invention of claim 8 is an information processing program for causing a computer to function as: first calculation means for calculating, based on a first coordinate information sequence and a second coordinate information sequence each composed of coordinate information, the distance of the portion where a first line segment represented by the first coordinate information sequence and a second line segment represented by the second coordinate information sequence overlap; second calculation means for calculating the length of the first line segment or the second line segment; and determination means for determining whether the first line segment and the second line segment constitute the same character, based on the ratio between the distance calculated by the first calculation means and the length calculated by the second calculation means.

According to the information processing apparatus of claim 1, errors in determining whether the first line segment and the second line segment constitute the same character can be reduced compared with a case that lacks the present configuration.

According to the information processing apparatus of claim 2, processing can be performed faster than in a case that lacks the present configuration.

According to the information processing apparatus of claim 3, memory consumption can be suppressed compared with a case that lacks the present configuration.

According to the information processing apparatus of claim 4, errors in determining whether the first line segment and the second line segment constitute the same character can be reduced compared with a case that lacks the present configuration.

According to the information processing apparatus of claim 5, processing can be performed faster than in a case that lacks the present configuration.

According to the information processing apparatus of claim 6, the group of line segments constituting a character can be extracted.

According to the information processing apparatus of claim 7, a character composed of a group of line segments can be recognized.

According to the information processing program of claim 8, errors in determining whether the first line segment and the second line segment constitute the same character can be reduced compared with a case that lacks the present configuration.

FIG. 1 is a conceptual module configuration diagram of a configuration example of the first embodiment.
FIG. 2 is an explanatory diagram showing a processing example according to the first embodiment.
FIG. 3 is a flowchart showing a processing example according to the first embodiment.
FIG. 4 is an explanatory diagram showing a processing example according to the first embodiment.
FIG. 5 is an explanatory diagram showing a processing example according to the first embodiment.
FIG. 6 is an explanatory diagram showing a processing example according to the first embodiment.
FIG. 7 is an explanatory diagram showing a processing example according to the first embodiment.
FIG. 8 is an explanatory diagram showing a processing example according to the first embodiment.
FIG. 9 is an explanatory diagram showing a processing example according to the first embodiment.
FIG. 10 is a conceptual module configuration diagram of a configuration example of the second embodiment.
FIG. 11 is an explanatory diagram showing a processing example according to the second embodiment.
FIG. 12 is a flowchart showing a processing example according to the second embodiment.
FIG. 13 is a conceptual module configuration diagram of a configuration example of the third embodiment.
FIG. 14 is an explanatory diagram showing a processing example according to the third embodiment.
FIG. 15 is a flowchart showing a processing example according to the third embodiment.
FIG. 16 is an explanatory diagram showing a processing example according to the fourth embodiment.
FIG. 17 is an explanatory diagram showing an example of the center of coordinate values.
FIG. 18 is an explanatory diagram showing an example of the center of time.
FIG. 19 is an explanatory diagram showing an example of the center of gravity.
FIG. 20 is an explanatory diagram showing an example of the weighted center of gravity.
FIG. 21 is a conceptual module configuration diagram of a configuration example of the fifth embodiment.
FIG. 22 is a conceptual module configuration diagram of a configuration example of the sixth embodiment.
FIG. 23 is an explanatory diagram showing a processing example according to the sixth embodiment.
FIG. 24 is a conceptual module configuration diagram of a configuration example of the seventh embodiment.
FIG. 25 is an explanatory diagram showing a processing example according to the seventh embodiment.
FIG. 26 is a conceptual module configuration diagram of a configuration example of the eighth embodiment.
FIG. 27 is a block diagram showing a hardware configuration example of a computer that realizes the present embodiment.
FIG. 28 is an explanatory diagram showing an example of the overlap width.
FIG. 29 is an explanatory diagram showing examples of the preconditions of the present embodiment.

Before describing the present embodiment, the technology on which it is premised is described first. This description is intended to make the present embodiment easier to understand.
The technology relates to an online handwritten character string segmentation device used in a character recognition device that performs character recognition and the like on handwritten character strings. In online handwriting, each character is composed of a plurality of line segments (hereinafter also referred to as strokes). To perform character recognition, for example, individual characters must be distinguished from one another.
FIG. 28 is an explanatory diagram showing an example of the overlap width. FIG. 28(a) shows strokes that should be separated: stroke 2810 and stroke 2820 belong to different characters. FIG. 28(b) shows strokes that should not be separated (that is, should be integrated): stroke 2830 and stroke 2840 constitute the same character.
Consider an algorithm that uses gaps in the stroke data as segmentation positions. Such an algorithm obtains the gap distance between adjacent strokes when the stroke sequence is projected onto one coordinate axis. When the gap distance is smaller than a set threshold, it merges the strokes on both sides of the gap into one; when the gap distance is larger than the threshold, it splits the strokes on both sides of the gap as separate ones. It treats the merged or split strokes as basic segments, sequentially generates candidate characters by combining adjacent basic segments, calculates the character density of the input character string from the basic segments, and determines the character feature quantities according to that density. This approach assumes that some spacing exists between characters, so when the spacing is narrow (or absent), the stroke data may not be separated and integrated correctly.
For example, as shown in FIG. 28(c), suppose the distance of the portion where stroke 2810 and stroke 2820 overlap equals the distance of the portion where stroke 2830 and stroke 2840 overlap (overlap interval 2850). If the algorithm described above uses an overlapping distance (a positive value) as the threshold in order to integrate stroke 2830 and stroke 2840, stroke 2810 and stroke 2820 are integrated as well. Conversely, if it uses a separation distance (a negative value) as the threshold in order to separate stroke 2810 and stroke 2820, stroke 2830 and stroke 2840 are separated as well.
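The limitation of the gap-based approach can be illustrated with a minimal Python sketch. It covers only the merge-or-split step on strokes projected onto the x axis, not the density calculation or candidate generation. The function name `group_by_gap` and all coordinates are hypothetical and chosen only to mimic the situation of FIG. 28(c), where a same-character pair and a different-character pair overlap by the same amount.

```python
from typing import List, Tuple

# (x_min, x_max) of one stroke projected onto the x axis (hypothetical representation)
Interval = Tuple[float, float]


def group_by_gap(intervals: List[Interval], gap_threshold: float) -> List[List[Interval]]:
    """Merge neighbouring projected strokes whose gap is smaller than gap_threshold."""
    groups: List[List[Interval]] = []
    current_right = float("-inf")
    for interval in sorted(intervals):
        left, right = interval
        gap = left - current_right  # negative when the projections overlap
        if groups and gap < gap_threshold:
            groups[-1].append(interval)          # same basic segment
            current_right = max(current_right, right)
        else:
            groups.append([interval])            # start a new basic segment
            current_right = right
    return groups


# Illustrative values: two short strokes of one character, then two long strokes
# of two different characters. Both pairs overlap by 1.0.
row = [(0.0, 2.0), (1.0, 6.0), (20.0, 40.0), (39.0, 60.0)]

merged_too_much = group_by_gap(row, gap_threshold=0.0)    # 2 groups
split_too_much = group_by_gap(row, gap_threshold=-2.0)    # 4 groups
```

With these values the desired result is three groups, but no single threshold produces it, because both pairs have the same gap of -1.0 and therefore always receive the same decision.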

FIG. 29 is an explanatory diagram showing examples of the preconditions of the present embodiment.
There are the following four preconditions.
(1) Strokes are separated (or integrated) in the direction orthogonal to the character string direction.
(2) Strokes orthogonal to the character string direction are integrated.
(3) A single character may be separated into its left-hand radical (hen) and right-hand component (tsukuri).
(4) When two horizontal bars touch slightly or overlap slightly on the x coordinate, they may belong to adjacent characters and are therefore separated. How "touching slightly" and "overlapping slightly" are determined is described later.

The example in FIG. 29(a) illustrates precondition (1). As shown in the figure, when the character string runs from left to right, the stroke groups are separated as indicated by the top-to-bottom arrows. Each separated stroke group is treated as one character and becomes the target of character recognition and the like.
The example in FIG. 29(b) illustrates precondition (2). As shown in the figure, the group of three strokes is integrated. How to determine whether to integrate is described later.

The examples in FIGS. 29(c1) and 29(c2) illustrate precondition (3). In the example of FIG. 29(c1), the character is separated into its left-hand radical and right-hand component, which is permitted in the present embodiment.
In the example of FIG. 29(c2), the crown radical (kanmuri) is separated from the part below it; such a separation is not permitted in the present embodiment. A single character may be separated into its left-hand radical and right-hand component because techniques for integrating the two into a single character already exist. One example is the multiple hypothesis testing method, in which several hypotheses are formed about how to segment the characters, character pattern candidates are cut out, and the correct hypothesis is then determined by character identification and character string matching. As a method that uses multiple hypothesis testing, the technique disclosed in JP 09-185681 A introduces into multiple hypothesis testing an evaluation value (outline-shape penalty) based on the size of a character pattern and its positional relationship with the preceding and following patterns. As another method, the technique disclosed in JP 08-161432 A first fixes, from the shape information of circumscribed rectangles, the segmentations that can be fixed. For the remainder, it estimates a plurality of segmentation candidates from combinations of circumscribed rectangles, obtains a recognition evaluation value for each rectangle in each candidate, computes a combined evaluation value for each candidate from those recognition evaluation values, and fixes the candidate with the best combined evaluation value as the segmentation result.

The example in FIG. 29(d) illustrates precondition (4). The two strokes shown in the figure should be separated.

Hereinafter, examples of various preferred embodiments for realizing the present invention are described with reference to the drawings.
<First Embodiment>
FIG. 1 is a conceptual module configuration diagram of a configuration example of the first embodiment.
A module generally refers to a logically separable component such as software (a computer program) or hardware. Accordingly, a module in the present embodiment refers not only to a module in a computer program but also to a module in a hardware configuration. The present embodiment therefore also serves as a description of a computer program for causing a computer to function as those modules (a program for causing a computer to execute each procedure, a program for causing a computer to function as each means, or a program for causing a computer to realize each function), as well as of a system and a method. For convenience of explanation, the terms "store", "cause to store", and their equivalents are used; when the embodiment is a computer program, these terms mean storing in a storage device or performing control so as to store in a storage device. Modules may correspond one-to-one to functions, but in implementation one module may be configured as one program, a plurality of modules may be configured as one program, or conversely one module may be configured as a plurality of programs. A plurality of modules may be executed by one computer, or one module may be executed by a plurality of computers in a distributed or parallel environment. One module may include another module. Hereinafter, "connection" is used not only for physical connection but also for logical connection (exchange of data, instructions, reference relationships between data, and so on). "Predetermined" means determined before the process in question. It covers not only determination before processing according to the present embodiment starts, but also determination after that processing has started, as long as it precedes the process in question, according to the situation or state at that time or up to that time. When there are a plurality of "predetermined values", they may differ from one another, or two or more of them (including, of course, all of them) may be the same. A statement of the form "when A, do B" means "determine whether A holds, and do B when it is determined that A holds", except where determining whether A holds is unnecessary.
A system or apparatus includes a configuration in which a plurality of computers, hardware units, devices, and the like are connected by communication means such as a network (including one-to-one communication connections), and also a configuration realized by a single computer, hardware unit, device, or the like. "Apparatus" and "system" are used as synonymous terms. Naturally, "system" does not include what is merely a social "mechanism" (social system), which is an artificial arrangement.
For each process performed by a module, or for each process when a module performs a plurality of processes, the target information is read from a storage device, the process is performed, and the processing result is written to the storage device. Description of reading from the storage device before processing and writing to the storage device after processing may therefore be omitted. The storage device here may include a hard disk, RAM (Random Access Memory), an external storage medium, a storage device accessed via a communication line, a register in a CPU (Central Processing Unit), and the like.

The information processing apparatus according to the first embodiment determines whether a plurality of line segments constituting characters should be integrated. As shown in the example of FIG. 1, it includes a stroke size calculation module 110, an overlap calculation module 120, and a determination module 130.

The overlap calculation module 120 is connected to the determination module 130. Based on stroke A: 100A, which is a first coordinate information sequence composed of coordinate information (hereinafter also referred to as stroke information), and stroke B: 100B, which is a second coordinate information sequence, the overlap calculation module 120 calculates the distance of the portion where the first line segment represented by stroke A: 100A and the second line segment represented by stroke B: 100B overlap (hereinafter, the overlap). Stroke A: 100A and stroke B: 100B are each configured as a group of two-dimensional coordinate data. For example, for a character string that a user writes on a tablet with an electronic pen or the like, each pen point is detected as two-dimensional coordinate data at predetermined time intervals. Alternatively, for a character string written with a scanner-equipped electronic pen or the like on paper printed with an information image in which coordinate data is embedded, the pen may analyze the information image to detect the coordinate data. A coordinate information sequence need only be composed of at least the coordinate data of a start point and an end point.
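As a rough illustration of the overlap calculation, the following Python sketch represents each stroke as a list of (x, y) points and measures the overlap along the x axis. Measuring along x is an assumption made here for a left-to-right character string, the names `Stroke`, `x_range`, and `overlap_distance` are hypothetical, and the sample coordinates are invented. The sign convention (positive when the strokes overlap, negative when they are apart) follows the description of positive and negative distances given for FIG. 28(c).

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) pen position, e.g. sampled at a fixed time interval
Stroke = List[Point]         # one stroke = one coordinate information sequence


def x_range(stroke: Stroke) -> Tuple[float, float]:
    """Return the extent (x_min, x_max) of a stroke projected onto the x axis."""
    xs = [x for x, _ in stroke]
    return min(xs), max(xs)


def overlap_distance(stroke_a: Stroke, stroke_b: Stroke) -> float:
    """Signed overlap along x: positive = overlapping length, negative = gap."""
    a_min, a_max = x_range(stroke_a)
    b_min, b_max = x_range(stroke_b)
    return min(a_max, b_max) - max(a_min, b_min)


# Illustrative strokes. A stroke may consist of only a start point and an end point.
stroke_a = [(0.0, 5.0), (4.0, 5.0), (10.0, 5.0)]
stroke_b = [(8.0, 2.0), (15.0, 3.0)]   # overlaps stroke_a on x from 8 to 10
stroke_c = [(12.0, 0.0), (20.0, 0.0)]  # lies entirely to the right of stroke_a

overlapping = overlap_distance(stroke_a, stroke_b)  # 2.0
separated = overlap_distance(stroke_a, stroke_c)    # -2.0
```

Because only the minimum and maximum x values are used, the sketch works equally for densely sampled strokes and for strokes given by their two end points.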

ストロークサイズ算出モジュール１１０は、判定モジュール１３０と接続されている。ストロークサイズ算出モジュール１１０は、第１の線分又は第２の線分の長さ（全長）を算出する。第１の線分又は第２の線分における始点から終点までの距離を算出する。
判定モジュール１３０は、ストロークサイズ算出モジュール１１０、オーバーラップ算出モジュール１２０と接続されている。判定モジュール１３０は、オーバーラップ算出モジュール１２０によって算出された距離とストロークサイズ算出モジュール１１０によって算出された長さの比率に基づいて、第１の線分と第２の線分が同じ文字を構成しているか否かを判定する。ここで、「比率に基づいて」とは、例えば、比率と予め定められた閾値とを比較することをいう。そして、比較結果として、閾値以上、閾値より大、閾値以下、閾値未満等の関係である場合に、同じ文字を構成していると判定する。そして、その判定結果１９９を出力する。出力先としては、例えば、後述する連結成分算出モジュール２４２０、オンライン文字列認識モジュール２６２０等である。また、判定モジュール１３０は、オーバーラップ算出モジュール１２０による算出結果が正値である場合に判定処理を行うようにしてもよい。つまり、オーバーラップが発生していない場合は、判定処理は不要だからである。又は、オーバーラップが発生していない場合は、同じ文字を構成していないとする判定結果にしてもよい。
図２は、第１の実施の形態（特に判定モジュール１３０）による処理例を示す説明図である。
ここでは、ストロークＡ：２００ＡとストロークＢ：２００Ｂが重なり合っている部分の距離（オーバーラップ間隔δ２２０）と、ストロークＡ：２００ＡのストロークサイズＷ２１０を用いる。そして、（オーバーラップ間隔δ２２０）／（ストロークサイズＷ２１０）×１００［％］が予め定められた閾値以上である場合、ストロークＡ：２００ＡとストロークＢ：２００Ｂは同じ文字を構成していると判定する。それ以外の場合は、同じ文字を構成していない（別々の文字の構成である）と判定する。
また、判定モジュール１３０は、２本のストロークにおいて、一方のストロークの予め定められた範囲（例えば、ストロークサイズ（ストロークの全長）のＡ％以上等）が、もう一方のストロークのｘ座標上における範囲内に含まれているかを判定するようにしてもよい。 The stroke size calculation module 110 is connected to the determination module 130. The stroke size calculation module 110 calculates the length (full length) of the first line segment or the second line segment. The distance from the start point to the end point in the first line segment or the second line segment is calculated.
The determination module 130 is connected to the stroke size calculation module 110 and the overlap calculation module 120. Based on the ratio between the distance calculated by the overlap calculation module 120 and the length calculated by the stroke size calculation module 110, the determination module 130 determines whether or not the first line segment and the second line segment constitute the same character. Here, “based on the ratio” means, for example, comparing the ratio with a predetermined threshold value. When the comparison yields the applicable relationship (the ratio is equal to or greater than, greater than, equal to or less than, or less than the threshold), it is determined that the segments constitute the same character. The determination result 199 is then output. Examples of the output destination include a connected component calculation module 2420 and an online character string recognition module 2620, both described later. The determination module 130 may perform the determination process only when the result calculated by the overlap calculation module 120 is a positive value, because the determination process is unnecessary when no overlap occurs. Alternatively, when no overlap occurs, the determination result may be that the segments do not constitute the same character.
FIG. 2 is an explanatory diagram illustrating a processing example according to the first embodiment (particularly, the determination module 130).
Here, the distance of the portion where stroke A: 200A and stroke B: 200B overlap (overlap interval δ220) and the stroke size W210 of stroke A: 200A are used. When (overlap interval δ220) / (stroke size W210) × 100 [%] is equal to or greater than a predetermined threshold, it is determined that stroke A: 200A and stroke B: 200B constitute the same character. Otherwise, it is determined that they do not constitute the same character (that is, they belong to different characters).
In addition, for two strokes, the determination module 130 may determine whether a predetermined portion of one stroke (for example, A% or more of the stroke size, that is, the total length of the stroke) falls within the range occupied by the other stroke on the x coordinate.

なお、記号を用いて説明するが、以下にその例を示す。
ｓ_ｉ：ストローク
ｗ_ｉ：ストロークｓ_ｉの幅
ｖ_ｉｊ：ストロークｓ_ｉとストロークｓ_ｊが重なり合っている部分の距離（オーバーラップ間隔、被覆幅）
ｂ_ｉｊ∈｛０，１｝：ｓ_ｉがｓ_ｊに従属する
ｍ_ｉｊ∈｛０，１｝：ｓ_ｉとｓ_ｊを統合する（ｂ_ｉｊとｂ_ｊｉの論理和）
Ｓ：ストローク列（ｓの系列）
Ｗ：ストロークの幅列（ｗの系列）
Ｍ：全判定結果情報（ｍのセット）
ｃ_ｋ：統合ストローク（ｓのセット）
Ｃ：統合ストローク列（ｃの系列）
θ：文字列方向
⊥：終了信号 The following symbols are used in the description below.
s_i: Stroke
w_i: Width of stroke s_i
v_ij: Distance of the portion where stroke s_i and stroke s_j overlap (overlap interval, covering width)
b_ij ∈ {0, 1}: s_i is dependent on s_j
m_ij ∈ {0, 1}: s_i and s_j are integrated (logical sum of b_ij and b_ji)
S: Stroke sequence (series of s)
W: Stroke width sequence (series of w)
M: All determination result information (set of m)
c_k: Integrated stroke (set of s)
C: Integrated stroke sequence (series of c)
θ: Character string direction
⊥: End signal

図３は、第１の実施の形態による処理例を示すフローチャートである。
ステップＳ３０２では、第１の実施の形態が、ストローク情報を取得する。例えば、図４（ａ）に示すように、ストローク４０２〜４１２のストローク情報を取得する。
ステップＳ３０４では、第１の実施の形態が、取得したストローク情報から対象とする２つのストローク情報を抽出する。例えば、図４（ｂ）に示すように、ストローク４０２とストローク４０４の組を抽出する。そして、２回目以降の処理で、ストローク４０６以降のストロークとストローク４０２との組み合わせを抽出する。この例は、全ての組み合わせを抽出する例である。 FIG. 3 is a flowchart illustrating a processing example according to the first exemplary embodiment.
In step S302, the first embodiment acquires stroke information. For example, as shown in FIG. 4A, the stroke information of the strokes 402 to 412 is acquired.
In step S304, the first embodiment extracts the two pieces of stroke information to be processed from the acquired stroke information. For example, as shown in FIG. 4B, the pair of stroke 402 and stroke 404 is extracted. In the second and subsequent iterations, the pairs of stroke 402 with stroke 406 and each later stroke are extracted. In this example, all combinations are extracted.

ステップＳ３０６では、ストロークサイズ算出モジュール１１０が、ストロークのサイズを算出する。図５は、第１の実施の形態（特にストロークサイズ算出モジュール１１０）による処理例を示す説明図である。ストロークサイズ算出モジュール１１０は、文字列方向θについて、ストロークｓ_ｉの幅ｗ_ｉを算出する。文字列方向が水平の場合は、そのストロークにおけるｘ座標の最小値から最大値までの距離である。 In step S306, the stroke size calculation module 110 calculates the size of the stroke. FIG. 5 is an explanatory diagram illustrating a processing example according to the first embodiment (particularly, the stroke size calculation module 110). The stroke size calculation module 110 calculates the width w_i of the stroke s_i with respect to the character string direction θ. When the character string direction is horizontal, the width is the distance from the minimum value to the maximum value of the x coordinate in the stroke.
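The width calculation in step S306 can be sketched as follows. This is a minimal illustration rather than the implementation of the embodiment: the function name `stroke_width`, the representation of a stroke as a list of (x, y) tuples, and the projection onto a unit vector to handle a general direction θ are assumptions introduced here. The text itself only specifies the horizontal case (minimum to maximum x coordinate).

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]  # one sampled (x, y) pen position (assumed representation)


def stroke_width(stroke: Sequence[Point], theta: float = 0.0) -> float:
    """Extent of a stroke along the character string direction theta (radians).

    theta = 0 is horizontal text, where the result is max(x) - min(x).
    """
    # Project every sample onto the unit vector of the text direction.
    ux, uy = math.cos(theta), math.sin(theta)
    projections = [x * ux + y * uy for x, y in stroke]
    return max(projections) - min(projections)


s_i = [(2.0, 5.0), (4.0, 1.0), (7.0, 3.0)]
print(stroke_width(s_i))  # 5.0 (x runs from 2.0 to 7.0)
```

For vertical text, passing `theta=math.pi / 2` measures the extent along y instead, up to floating-point rounding.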

ステップＳ３０８では、オーバーラップ算出モジュール１２０が、２本のストロークの重なりを算出する。図６は、第１の実施の形態（特にオーバーラップ算出モジュール１２０）による処理例を示す説明図である。オーバーラップ算出モジュール１２０は、文字列方向θについて、ストロークｓ_ｉとｓ_ｊの被覆幅ｖ_ｉｊを算出する。 In step S308, the overlap calculation module 120 calculates the overlap of two strokes. FIG. 6 is an explanatory diagram illustrating a processing example according to the first embodiment (particularly, the overlap calculation module 120). The overlap calculation module 120 calculates the covering width v _ij of the strokes s _i and s _j with respect to the character string direction θ.
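The covering width of step S308 amounts to the length of the intersection of two intervals. The sketch below assumes horizontal text, so each stroke is reduced to its x range; the names `x_range` and `overlap_width` are hypothetical, and clamping to zero when the strokes do not overlap is a choice made here for illustration.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]  # one sampled (x, y) pen position (assumed representation)


def x_range(stroke: Sequence[Point]) -> Tuple[float, float]:
    """Smallest and largest x coordinate of a stroke."""
    xs = [x for x, _ in stroke]
    return min(xs), max(xs)


def overlap_width(s_i: Sequence[Point], s_j: Sequence[Point]) -> float:
    """Length of the x interval shared by both strokes (0.0 if they are disjoint)."""
    lo_i, hi_i = x_range(s_i)
    lo_j, hi_j = x_range(s_j)
    return max(0.0, min(hi_i, hi_j) - max(lo_i, lo_j))


a = [(0.0, 0.0), (10.0, 3.0)]
b = [(8.0, 1.0), (12.0, 2.0)]
print(overlap_width(a, b))  # 2.0 (shared interval is [8, 10])
```

Because the intersection is symmetric, `overlap_width(a, b)` and `overlap_width(b, a)` give the same value, which is why a single v_ij can later be compared against either stroke's width.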

ステップＳ３１０では、判定モジュール１３０が、統合判定を行う。図７は、第１の実施の形態（特に判定モジュール１３０）による処理例を示す説明図である。判定モジュール１３０は、幅ｗ_ｉと被覆幅ｖ_ｉｊの比率が閾値ｔｈ以上であれば１（ストロークｓ_ｉとｓ_ｊは同じ文字を構成している）を、そうでなければ０（ストロークｓ_ｉとｓ_ｊは同じ文字を構成していない）を返す。 In step S310, the determination module 130 performs the integration determination. FIG. 7 is an explanatory diagram illustrating a processing example according to the first embodiment (particularly, the determination module 130). The determination module 130 returns 1 (strokes s_i and s_j constitute the same character) if the ratio of the covering width v_ij to the width w_i is equal to or greater than the threshold th, and returns 0 (strokes s_i and s_j do not constitute the same character) otherwise.
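The determination of step S310 reduces to a single comparison. In the sketch below, the function name `depends_on`, the default threshold of 0.5, and the handling of a zero width or zero overlap (returning 0) are assumptions for illustration; the text only states that the ratio is compared with a predetermined threshold.

```python
def depends_on(w_i: float, v_ij: float, th: float = 0.5) -> int:
    """Return b_ij: 1 if the overlap v_ij covers at least th of the width w_i, else 0."""
    # Assumed edge-case handling: no overlap or a zero-width stroke yields 0.
    if v_ij <= 0.0 or w_i <= 0.0:
        return 0
    return 1 if v_ij / w_i >= th else 0


print(depends_on(w_i=10.0, v_ij=2.0))  # 0 (ratio 0.2 is below 0.5)
print(depends_on(w_i=4.0, v_ij=2.0))   # 1 (ratio 0.5 reaches the threshold)
```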

ステップＳ３１２では、判定モジュール１３０が、残っているストロークがあるか否かによって終了か否かを判断し、終了の場合は処理を終了し（ステップＳ３９９）、それ以外の場合はステップＳ３０４へ戻る。
ステップＳ３０６とステップＳ３０８の処理は、いずれが先に処理を行ってもよいし、平行して処理してもよい。 In step S312, the determination module 130 determines whether or not to end depending on whether or not there is a remaining stroke. If it is ended, the process ends (step S399), and otherwise returns to step S304.
Steps S306 and S308 may be performed in either order, or in parallel.

図８は、第１の実施の形態による処理例を示す説明図である。特に、データの流れを示したものである。ここでは、ストロークｓ_ｉとストロークｓ_ｊは、同じ文字を構成しているか否かを判定するものである。ストロークサイズ算出モジュール１１０は、ストロークｓ_ｉを受け取り、ｗ_ｉを算出する。オーバーラップ算出モジュール１２０は、ストロークｓ_ｉとストロークｓ_ｊを受け取り、ｖ_ｉｊを算出する。判定モジュール１３０は、ｗ_ｉとｖ_ｉｊを受け取り、判定結果をｂ_ｉｊとして、その２つのストロークが同じ文字を構成しているならばｂ_ｉｊ＝１を、そうでなければｂ_ｉｊ＝０を返す。 FIG. 8 is an explanatory diagram illustrating a processing example according to the first embodiment; in particular, it shows the flow of data. Here, it is determined whether or not the stroke s_i and the stroke s_j constitute the same character. The stroke size calculation module 110 receives the stroke s_i and calculates w_i. The overlap calculation module 120 receives the stroke s_i and the stroke s_j and calculates v_ij. The determination module 130 receives w_i and v_ij and returns the determination result b_ij: b_ij = 1 if the two strokes constitute the same character, and b_ij = 0 otherwise.

図９は、第１の実施の形態による処理例を示す説明図である。図８の例に示したように、ストロークサイズ算出モジュール１１０がストロークｓ_ｉの長さw_ｉを算出し、w_ｉにおけるv_ijの占める割合に基づいて判定モジュール１３０が判定することを示しており、どちらのストロークがｓ_ｉになるかによって、判定結果が異なる場合があることを示している。
図９（ａ）は、ストローク９００Ａをｓ_ｉとした場合の例を示している。つまり、ストロークｓ_ｉがストロークs_ｊより長く、ストロークサイズＷ_ｉ９１０に占めるオーバーラップ間隔Ｖ_ｉｊ９２０の割合が閾値Ｔｈ未満となるため、判定結果は、ｂ_ｉｊ＝０となる。したがって、ストロークｓ_ｉとストロークs_ｊを分離する。
図９（ｂ）は、ストローク９００Ｂをｓ_ｉとした場合の例を示している。図９（ａ）の例とは、ストロークｓ_ｉとストロークs_ｊの関係が逆になっている。つまり、ストロークｓ_ｊがストロークs_ｉより長く、ストロークサイズＷ_ｉ９３０に占めるオーバーラップ間隔Ｖ_ｉｊ９２０の割合が閾値Ｔｈ以上となるため、判定結果は、ｂ_ｉｊ＝１となる。したがって、ストロークｓ_ｉとストロークs_ｊを統合する。
ストロークの組み合わせとして、全ての組み合わせ（図９の例のように、いずれをｓ_ｉとするかについての組み合わせを含む）を生成し、判定を行う。そして、同じ２つのストロークの組み合わせにおいて、いずれかの判定結果が同じ文字の構成である（ｂ_ｉｊ＝１）となった場合は、その組み合わせの判定結果は、同じ文字の構成であるとする。 FIG. 9 is an explanatory diagram illustrating a processing example according to the first embodiment. As shown in the example of FIG. 8, the stroke size calculation module 110 calculates the length w_i of the stroke s_i, and the determination module 130 makes its determination based on the proportion of w_i occupied by v_ij. FIG. 9 shows that the determination result may differ depending on which stroke is taken as s_i.
FIG. 9A shows an example in which the stroke 900A is taken as s_i. That is, the stroke s_i is longer than the stroke s_j, and the ratio of the overlap interval V_ij 920 to the stroke size W_i 910 is less than the threshold Th, so the determination result is b_ij = 0. Therefore, the stroke s_i and the stroke s_j are separated.
FIG. 9B shows an example in which the stroke 900B is taken as s_i. The relationship between the stroke s_i and the stroke s_j is the reverse of that in the example of FIG. 9A. That is, the stroke s_j is longer than the stroke s_i, and the ratio of the overlap interval V_ij 920 to the stroke size W_i 930 is equal to or greater than the threshold Th, so the determination result is b_ij = 1. Therefore, the stroke s_i and the stroke s_j are integrated.
All combinations of strokes are generated and judged, including, as in the example of FIG. 9, both choices of which stroke is taken as s_i. If, for the same pair of strokes, either determination result indicates the same character (b_ij = 1), the determination result for that pair is that the strokes constitute the same character.

＜第２の実施の形態＞
図１０は、第２の実施の形態の構成例についての概念的なモジュール構成図である。第２の実施の形態は、図１０の例に示すように、ストロークサイズ算出モジュール１０１２、ストロークサイズ算出モジュール１０１４、オーバーラップ算出モジュール１０２０、判定モジュール１０３２、判定モジュール１０３４、ＯＲ演算モジュール１０３６を有している。 <Second Embodiment>
FIG. 10 is a conceptual module configuration diagram of a configuration example according to the second embodiment. As shown in the example of FIG. 10, the second embodiment includes a stroke size calculation module 1012, a stroke size calculation module 1014, an overlap calculation module 1020, a determination module 1032, a determination module 1034, and an OR operation module 1036.

ストロークサイズ算出モジュール１０１２は、判定モジュール１０３２と接続されている。ストロークサイズ算出モジュール１０１２は、ストロークｓ_ｉの長さw_ｉを算出し、判定モジュール１０３２に渡す。
ストロークサイズ算出モジュール１０１４は、判定モジュール１０３４と接続されている。ストロークサイズ算出モジュール１０１４は、ストロークｓ_ｊの長さw_ｊを算出し、判定モジュール１０３４に渡す。
オーバーラップ算出モジュール１０２０は、判定モジュール１０３２、判定モジュール１０３４と接続されている。オーバーラップ算出モジュール１０２０は、ストロークｓ_ｉとストロークｓ_ｊを受け取り、ｖ_ｉｊを算出し、判定モジュール１０３２、判定モジュール１０３４に渡す。
判定モジュール１０３２は、ストロークサイズ算出モジュール１０１２、オーバーラップ算出モジュール１０２０、ＯＲ演算モジュール１０３６と接続されている。判定モジュール１０３２は、オーバーラップ算出モジュール１０２０によって算出されたｖ_ｉｊとストロークサイズ算出モジュール１０１２によって算出されたｗ_ｉの比率に基づいて、ストロークｓ_ｉとストロークｓ_ｊが同じ文字を構成しているか否かの仮の判定を行い、判定結果であるｂ_ｉｊをＯＲ演算モジュール１０３６に渡す。
判定モジュール１０３４は、ストロークサイズ算出モジュール１０１４、オーバーラップ算出モジュール１０２０、ＯＲ演算モジュール１０３６と接続されている。判定モジュール１０３４は、オーバーラップ算出モジュール１０２０によって算出されたｖ_ｉｊとストロークサイズ算出モジュール１０１４によって算出されたｗ_ｊの比率に基づいて、ストロークｓ_ｊとストロークｓ_ｉが同じ文字を構成しているか否かの仮の判定を行い、判定結果であるｂ_ｊｉをＯＲ演算モジュール１０３６に渡す。
ＯＲ演算モジュール１０３６は、判定モジュール１０３２、判定モジュール１０３４と接続されている。ＯＲ演算モジュール１０３６は、判定モジュール１０３２による仮の判定結果ｂ_ｉｊと判定モジュール１０３４による仮の判定結果ｂ_ｊｉに基づいて、ストロークｓ_ｉとストロークｓ_ｊが同じ文字を構成しているか否かを判定する。そして、判定結果をm_ｉｊとして、同じ文字を構成しているならばｍ_ｉｊ＝１を、そうでなければｍ_ｉｊ＝０を出力する。 The stroke size calculation module 1012 is connected to the determination module 1032. The stroke size calculation module 1012 calculates the length w _i of the stroke s _i and passes it to the determination module 1032.
The stroke size calculation module 1014 is connected to the determination module 1034. The stroke size calculation module 1014 calculates the length w _j of the stroke s _j and passes it to the determination module 1034.
The overlap calculation module 1020 is connected to the determination module 1032 and the determination module 1034. The overlap calculation module 1020 receives the stroke s _i and the stroke s _j , calculates v _ij , and passes it to the determination module 1032 and the determination module 1034.
The determination module 1032 is connected to the stroke size calculation module 1012, the overlap calculation module 1020, and the OR operation module 1036. Based on the ratio between v_ij calculated by the overlap calculation module 1020 and w_i calculated by the stroke size calculation module 1012, the determination module 1032 makes a provisional determination of whether or not the stroke s_i and the stroke s_j constitute the same character, and passes the determination result b_ij to the OR operation module 1036.
The determination module 1034 is connected to the stroke size calculation module 1014, the overlap calculation module 1020, and the OR operation module 1036. Based on the ratio between v_ij calculated by the overlap calculation module 1020 and w_j calculated by the stroke size calculation module 1014, the determination module 1034 makes a provisional determination of whether or not the stroke s_j and the stroke s_i constitute the same character, and passes the determination result b_ji to the OR operation module 1036.
The OR operation module 1036 is connected to the determination module 1032 and the determination module 1034. The OR operation module 1036 determines whether or not the stroke s_i and the stroke s_j constitute the same character based on the provisional determination result b_ij from the determination module 1032 and the provisional determination result b_ji from the determination module 1034. It outputs the determination result as m_ij: m_ij = 1 if the strokes constitute the same character, and m_ij = 0 otherwise.

ストロークサイズ算出モジュール１０１２、ストロークサイズ算出モジュール１０１４は、第１の実施の形態のストロークサイズ算出モジュール１１０と同等の処理を行う。オーバーラップ算出モジュール１０２０は、第１の実施の形態のオーバーラップ算出モジュール１２０と同等の処理を行う。判定モジュール１０３２、判定モジュール１０３４は、第１の実施の形態の判定モジュール１３０と同等の処理を行う。
ストロークサイズ算出モジュール１０１２による算出処理とストロークサイズ算出モジュール１０１４による算出処理は平行して行われる。判定モジュール１０３２による判定処理と判定モジュール１０３４による判定処理は平行して行われる。 The stroke size calculation module 1012 and the stroke size calculation module 1014 perform the same processing as the stroke size calculation module 110 of the first embodiment. The overlap calculation module 1020 performs the same processing as the overlap calculation module 120 of the first embodiment. The determination module 1032 and the determination module 1034 perform the same processing as the determination module 130 of the first embodiment.
The calculation process by the stroke size calculation module 1012 and the calculation process by the stroke size calculation module 1014 are performed in parallel. The determination process by the determination module 1032 and the determination process by the determination module 1034 are performed in parallel.

図１１は、第２の実施の形態による処理例を示す説明図である。
ストロークサイズ算出モジュール１０１２がストロークｓ_ｉ１１００ＡのストロークサイズＷ_ｉ１１１０を算出し、ストロークサイズ算出モジュール１０１４がストロークｓ_ｊ１１００ＢのストロークサイズＷ_ｊ１１２０を算出し、オーバーラップ算出モジュール１０２０がオーバーラップ間隔Ｖ_ｉｊ１１３０を算出し、判定モジュール１０３２がストロークサイズＷ_ｉ１１１０におけるオーバーラップ間隔Ｖ_ｉｊ１１３０の占める割合に基づいて仮判定し、判定モジュール１０３４がストロークサイズＷ_ｊ１１２０におけるオーバーラップ間隔Ｖ_ｉｊ１１３０の占める割合に基づいて仮判定し、ＯＲ演算モジュール１０３６が２つの仮判定結果（ｂ_ｉｊ、ｂ_ｊｉ）から判定し、判定結果m_ｉｊを出力する。この場合、ストロークサイズＷ_ｊ１１２０に占めるオーバーラップ間隔Ｖ_ｉｊ１１３０の割合が閾値Ｔｈ以上であるので、ｂ_ｊｉ＝１となり、m_ｉｊ＝１となる。したがって、ストロークｓ_ｉ１１００Ａとストロークs_ｊ１１００Ｂを統合する。 FIG. 11 is an explanatory diagram illustrating a processing example according to the second exemplary embodiment.
The stroke size calculation module 1012 calculates the stroke size W_i 1110 of the stroke s_i 1100A, the stroke size calculation module 1014 calculates the stroke size W_j 1120 of the stroke s_j 1100B, and the overlap calculation module 1020 calculates the overlap interval V_ij 1130. The determination module 1032 makes a provisional determination based on the proportion of the stroke size W_i 1110 occupied by the overlap interval V_ij 1130, and the determination module 1034 makes a provisional determination based on the proportion of the stroke size W_j 1120 occupied by the overlap interval V_ij 1130. The OR operation module 1036 then makes the final determination from the two provisional results (b_ij, b_ji) and outputs the determination result m_ij. In this case, the ratio of the overlap interval V_ij 1130 to the stroke size W_j 1120 is equal to or greater than the threshold Th, so b_ji = 1 and therefore m_ij = 1. Accordingly, the stroke s_i 1100A and the stroke s_j 1100B are integrated.

なお、ＯＲ演算モジュール１０３６の判定は、ｂ_ｉｊ、ｂ_ｊｉのいずれかが１（同じ文字を構成している）であれば、判定結果m_ｉｊも１（同じ文字を構成している）となる。つまり、論理積（ＡＮＤ）演算ではなく、論理和（ＯＲ）演算としている。これは、次のような理由による。図１１の例に示したように、
Ｖ_ｉｊ／Ｗ_ｉ＜Ｔｈ → ｂ_ｉｊ＝０
Ｖ_ｉｊ／Ｗ_ｊ≧Ｔｈ → ｂ_ｊｉ＝１
となる。
w_ｉからみてv_ijはＴｈ未満（ｂ_ｉｊ＝０）であるが、w_ｊからみるとv_ijはＴｈ以上（ｂ_ｊｉ＝１）となっているため、論理和演算によりｍ_ｉｊ＝１とする。
論理積演算ではｍ_ｉｊ＝０となるため、図１１に示す例においてストロークを分離してしまう。よって、論理和演算とする。 In the determination by the OR operation module 1036, if either b_ij or b_ji is 1 (the strokes constitute the same character), the determination result m_ij is also 1 (the strokes constitute the same character). That is, a logical sum (OR) operation is used rather than a logical product (AND) operation, for the following reason. In the example of FIG. 11, the following hold:
V_ij / W_i < Th → b_ij = 0
V_ij / W_j ≧ Th → b_ji = 1
Relative to w_i, the proportion occupied by v_ij is less than Th (b_ij = 0), whereas relative to w_j it is equal to or greater than Th (b_ji = 1); the logical sum operation therefore gives m_ij = 1.
A logical product operation would give m_ij = 0 and would therefore separate the strokes in the example shown in FIG. 11. For this reason, the logical sum operation is used.
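The difference between the logical sum and the logical product can be checked with concrete numbers. The widths, overlap, and threshold below are hypothetical values chosen only to mirror the situation of FIG. 11 (a long stroke s_i and a short stroke s_j); the helper name `provisional` is likewise an assumption.

```python
def provisional(v_ij: float, w: float, th: float) -> int:
    """Provisional result: 1 if the overlap covers at least th of the width w, else 0."""
    return 1 if w > 0 and v_ij / w >= th else 0


# Hypothetical values: s_i is long, s_j is short, and they overlap by 3 units.
w_i, w_j, v_ij, th = 10.0, 4.0, 3.0, 0.5

b_ij = provisional(v_ij, w_i, th)  # 3 / 10 = 0.30, below Th
b_ji = provisional(v_ij, w_j, th)  # 3 / 4  = 0.75, at or above Th

m_or = b_ij | b_ji   # logical sum, as used by the OR operation module
m_and = b_ij & b_ji  # logical product, shown only for contrast

print(b_ij, b_ji, m_or, m_and)  # 0 1 1 0
```

With the logical product the short stroke would be split off even though three quarters of it lies under the long stroke, which is the failure the text describes.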

図１２は、第２の実施の形態による処理例を示すフローチャートである。
ステップＳ１２０２では、第２の実施の形態が、ストローク情報を取得する。
ステップＳ１２０４では、第２の実施の形態が、ストローク情報を抽出する。
ステップＳ１２０６では、オーバーラップ算出モジュール１０２０が、重なりを算出する。
ステップＳ１２０８では、ストロークサイズ算出モジュール１０１２が、サイズを算出する。
ステップＳ１２１０では、ストロークサイズ算出モジュール１０１４が、サイズを算出する。
ステップＳ１２１２では、判定モジュール１０３２が、統合判定を行う。
ステップＳ１２１４では、判定モジュール１０３４が、統合判定を行う。
ステップＳ１２１６では、ＯＲ演算モジュール１０３６が、総合判定を行う。
ステップＳ１２１８では、ＯＲ演算モジュール１０３６が、終了したか否かを判断し、終了の場合は処理を終了し（ステップＳ１２９９）、それ以外の場合はステップＳ１２０４へ戻る。 FIG. 12 is a flowchart illustrating a processing example according to the second exemplary embodiment.
In step S1202, the second embodiment acquires stroke information.
In step S1204, the second embodiment extracts stroke information.
In step S1206, the overlap calculation module 1020 calculates an overlap.
In step S1208, the stroke size calculation module 1012 calculates the size.
In step S1210, the stroke size calculation module 1014 calculates the size.
In step S1212, the determination module 1032 performs integration determination.
In step S1214, the determination module 1034 performs integration determination.
In step S1216, the OR operation module 1036 performs comprehensive determination.
In step S1218, the OR operation module 1036 determines whether or not the processing has ended. If the processing has ended, the processing ends (step S1299). Otherwise, the processing returns to step S1204.

＜第３の実施の形態＞
図１３は、第３の実施の形態の構成例についての概念的なモジュール構成図である。第３の実施の形態は、ストロークサイズ算出モジュール１３１２、ストロークサイズ算出モジュール１３１４、オーバーラップ算出モジュール１３２０、ミニマム演算モジュール１３３２、判定モジュール１３３４を有している。 <Third Embodiment>
FIG. 13 is a conceptual module configuration diagram of an exemplary configuration according to the third embodiment. The third embodiment includes a stroke size calculation module 1312, a stroke size calculation module 1314, an overlap calculation module 1320, a minimum calculation module 1332, and a determination module 1334.

ストロークサイズ算出モジュール１３１２は、ミニマム演算モジュール１３３２と接続されている。ストロークサイズ算出モジュール１３１２は、ストロークｓ_ｉの長さw_ｉを算出し、ミニマム演算モジュール１３３２に渡す。第１の実施の形態のストロークサイズ算出モジュール１１０と同等の処理を行う。
ストロークサイズ算出モジュール１３１４は、ミニマム演算モジュール１３３２と接続されている。ストロークサイズ算出モジュール１３１４は、ストロークｓ_ｊの長さw_ｊを算出し、ミニマム演算モジュール１３３２に渡す。第１の実施の形態のストロークサイズ算出モジュール１１０と同等の処理を行う。
オーバーラップ算出モジュール１３２０は、判定モジュール１３３４と接続されている。オーバーラップ算出モジュール１３２０は、第１の実施の形態のオーバーラップ算出モジュール１２０と同等の処理を行う。 The stroke size calculation module 1312 is connected to the minimum calculation module 1332. The stroke size calculation module 1312 calculates the length w _i of the stroke s _i and passes it to the minimum calculation module 1332. Processing equivalent to that of the stroke size calculation module 110 of the first embodiment is performed.
The stroke size calculation module 1314 is connected to the minimum calculation module 1332. The stroke size calculation module 1314 calculates the length w _j of the stroke s _j and passes it to the minimum calculation module 1332. Processing equivalent to that of the stroke size calculation module 110 of the first embodiment is performed.
The overlap calculation module 1320 is connected to the determination module 1334. The overlap calculation module 1320 performs the same processing as the overlap calculation module 120 of the first embodiment.

ミニマム演算モジュール１３３２は、ストロークサイズ算出モジュール１３１２、ストロークサイズ算出モジュール１３１４、判定モジュール１３３４と接続されている。ミニマム演算モジュール１３３２は、ストロークサイズ算出モジュール１３１２によって算出された長さｗ_ｉとストロークサイズ算出モジュール１３１４によって算出された長さｗ_ｊのいずれか一方を選択する。具体的には、長さｗ_ｉと長さｗ_ｊを比較して、短いものを選択する。つまり、短いストロークが長いストロークへ従属しない（同じ文字を構成していない）とき、逆も必ず従属しないという関係になるからである。 The minimum calculation module 1332 is connected to the stroke size calculation module 1312, the stroke size calculation module 1314, and the determination module 1334. The minimum calculation module 1332 selects either the length w_i calculated by the stroke size calculation module 1312 or the length w_j calculated by the stroke size calculation module 1314. Specifically, the length w_i is compared with the length w_j and the shorter one is selected. This is because, when the shorter stroke is not dependent on the longer stroke (they do not constitute the same character), the reverse is never dependent either.

判定モジュール１３３４は、オーバーラップ算出モジュール１３２０、ミニマム演算モジュール１３３２と接続されている。判定モジュール１３３４は、オーバーラップ算出モジュール１３２０によって算出された距離Ｖ_ｉｊとミニマム演算モジュール１３３２によって選択された長さｗの比率に基づいて、ストロークｓ_ｉとストロークｓ_ｊが同じ文字を構成しているか否かを判定する。第１の実施の形態の判定モジュール１３０と同等の処理を行う。
なお、第２の実施の形態よりも必要とするメモリ容量を少なくしたい場合は、仮判定結果ｂを記憶する必要がない第３の実施の形態を採用すればよい。 The determination module 1334 is connected to the overlap calculation module 1320 and the minimum calculation module 1332. The determination module 1334 determines whether the stroke s _i and the stroke s _j constitute the same character based on the ratio of the distance V _ij calculated by the overlap calculation module 1320 and the length w selected by the minimum calculation module 1332. Determine whether or not. Processing equivalent to that of the determination module 130 of the first embodiment is performed.
If it is desired to reduce the required memory capacity compared to the second embodiment, the third embodiment that does not need to store the provisional determination result b may be employed.
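The reason a single comparison against the shorter width suffices can be written out as a short derivation. It assumes positive widths, a non-negative overlap, and the "ratio equal to or greater than Th" form of the test; the symbols w_min and w_max and the bracket notation (1 if the condition holds, 0 otherwise) are introduced here for illustration only.

```latex
\begin{aligned}
w_{\min} &= \min(w_i, w_j) \;\le\; \max(w_i, w_j) = w_{\max},
  \qquad v_{ij} \ge 0 \\[4pt]
\Rightarrow\quad \frac{v_{ij}}{w_{\min}} &\;\ge\; \frac{v_{ij}}{w_{\max}} \\[4pt]
\Rightarrow\quad m_{ij} \;=\; b_{ij} \lor b_{ji}
  &\;=\; \left[\frac{v_{ij}}{w_{\min}} \ge Th\right]
\end{aligned}
```

If the test fails for the shorter width it also fails for the longer one, so the result obtained with the shorter width equals the logical sum of the two provisional results of the second embodiment.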

図１４は、第３の実施の形態による処理例を示す説明図である。
ストロークサイズ算出モジュール１３１２がストロークｓ_ｉ１４００ＡのストロークサイズＷ_ｉ１４１０を算出し、ストロークサイズ算出モジュール１３１４がストロークｓ_ｊ１４００ＢのストロークサイズＷ_ｊ１４２０を算出し、ミニマム演算モジュール１３３２がストロークサイズＷ_ｉ１４１０とストロークサイズＷ_ｊ１４２０のうち短いストロークサイズＷ_ｊ１４２０を選択し、オーバーラップ算出モジュール１３２０がオーバーラップ間隔Ｖ_ｉｊ１４３０を算出し、判定モジュール１３３４がストロークサイズＷ_ｊ１４２０におけるオーバーラップ間隔Ｖ_ｉｊ１４３０の占める割合に基づいて判定し、判定結果m_ｉｊを出力する。この場合、ストロークサイズＷ_ｊ１４２０に占めるオーバーラップ間隔Ｖ_ｉｊ１４３０の割合が閾値Ｔｈ以上であるので、m_ｉｊ＝１となる。したがって、ストロークｓ_ｉ１４００Ａとストロークs_ｊ１４００Ｂを統合する。 FIG. 14 is an explanatory diagram illustrating a processing example according to the third exemplary embodiment.
The stroke size calculation module 1312 calculates the stroke size W_i 1410 of the stroke s_i 1400A, and the stroke size calculation module 1314 calculates the stroke size W_j 1420 of the stroke s_j 1400B. The minimum calculation module 1332 selects the shorter of the stroke size W_i 1410 and the stroke size W_j 1420, here the stroke size W_j 1420. The overlap calculation module 1320 calculates the overlap interval V_ij 1430, and the determination module 1334 makes its determination based on the proportion of the stroke size W_j 1420 occupied by the overlap interval V_ij 1430 and outputs the determination result m_ij. In this case, the ratio of the overlap interval V_ij 1430 to the stroke size W_j 1420 is equal to or greater than the threshold Th, so m_ij = 1. Therefore, the stroke s_i 1400A and the stroke s_j 1400B are integrated.

図１５は、第３の実施の形態による処理例を示すフローチャートである。
ステップＳ１５０２では、第３の実施の形態が、ストローク情報を取得する。
ステップＳ１５０４では、第３の実施の形態が、ストローク情報を抽出する。
ステップＳ１５０６では、ストロークサイズ算出モジュール１３１２が、一方のストロークのサイズを算出する。
ステップＳ１５０８では、ストロークサイズ算出モジュール１３１４が、他方のストロークのサイズを算出する。
ステップＳ１５１０では、オーバーラップ算出モジュール１３２０が、重なりを算出する。
ステップＳ１５１２では、ミニマム演算モジュール１３３２が、ストロークのサイズで小さい方を選択する。
ステップＳ１５１４では、判定モジュール１３３４が、統合判定を行う。
ステップＳ１５１６では、判定モジュール１３３４が、終了したか否かを判断し、終了の場合は処理を終了し（ステップＳ１５９９）、それ以外の場合はステップＳ１５０４へ戻る。 FIG. 15 is a flowchart illustrating a processing example according to the third exemplary embodiment.
In step S1502, the third embodiment acquires stroke information.
In step S1504, the third embodiment extracts stroke information.
In step S1506, the stroke size calculation module 1312 calculates the size of one stroke.
In step S1508, the stroke size calculation module 1314 calculates the size of the other stroke.
In step S1510, the overlap calculation module 1320 calculates an overlap.
In step S1512, the minimum calculation module 1332 selects the smaller stroke size.
In step S1514, the determination module 1334 performs an integrated determination.
In step S1516, the determination module 1334 determines whether or not the processing has ended. If the processing has ended, the processing ends (step S1599). Otherwise, the processing returns to step S1504.

＜第４の実施の形態＞
図１６は、第４の実施の形態による処理例を示す説明図である。
第１の実施の形態においては、図１６（ａ）の例に示すように、ストローク１６００Ａの長さであるストロークサイズＷ_ｉ１６１０を用いているが、ストローク中のある一点（ストロークの中心、重心等）が他のストロークの範囲に入るか否かを用いるようにしてもよい。
具体的には、図１６（ｂ）の例に示すように、第１の実施の形態のストロークサイズ算出モジュール１１０が、ストローク１６００Ａ又はストローク１６００Ｂの範囲を算出する。つまり、ｘ軸上での始点から終点までの範囲を算出する。第１の実施の形態のオーバーラップ算出モジュール１２０が、ストローク１６００Ｂ又はストローク１６００Ａの予め定められた位置を算出する。例えば、ストロークの中心（例えば、中心Ｃ_ｊ）等である。なお、ストロークサイズ算出モジュール１１０がストローク１６００Ａを対象とした場合は、オーバーラップ算出モジュール１２０はストローク１６００Ｂを対象とする。そして、ストロークサイズ算出モジュール１１０がストローク１６００Ｂを対象とした場合は、オーバーラップ算出モジュール１２０はストローク１６００Ａを対象とする。そして、第１の実施の形態の判定モジュール１３０は、オーバーラップ算出モジュール１２０によって算出された位置（例えば、中心Ｃ_ｊ）が、ストロークサイズ算出モジュール１１０によって算出された範囲（例えば、ストローク１６００Ａの左端から右端までの範囲）に含まれているか否かによって、ストローク１６００Ａとストローク１６００Ｂが同じ文字を構成しているか否かを判定する。 <Fourth embodiment>
FIG. 16 is an explanatory diagram illustrating a processing example according to the fourth exemplary embodiment.
In the first embodiment, as shown in the example of FIG. 16A, the stroke size W_i 1610, which is the length of the stroke 1600A, is used. Instead, the determination may be based on whether a certain point in one stroke (the center of the stroke, the center of gravity, or the like) falls within the range of the other stroke.
Specifically, as shown in the example of FIG. 16B, the stroke size calculation module 110 of the first embodiment calculates the range of the stroke 1600A or the stroke 1600B, that is, the range from the start point to the end point on the x-axis. The overlap calculation module 120 of the first embodiment calculates a predetermined position of the stroke 1600B or the stroke 1600A, for example, the center of the stroke (for example, the center C_j). When the stroke size calculation module 110 targets the stroke 1600A, the overlap calculation module 120 targets the stroke 1600B, and when the stroke size calculation module 110 targets the stroke 1600B, the overlap calculation module 120 targets the stroke 1600A. The determination module 130 of the first embodiment then determines whether or not the stroke 1600A and the stroke 1600B constitute the same character according to whether the position calculated by the overlap calculation module 120 (for example, the center C_j) is included in the range calculated by the stroke size calculation module 110 (for example, the range from the left end to the right end of the stroke 1600A).

ストローク中のある一点として、中心を例にして説明する。
図１６（ｂ）の例に示すように、２本のストローク（ストローク１６００Ａ、ストローク１６００Ｂ）のうち、一方のストローク（ストローク１６００Ｂ）のストローク中心１６３０が、もう一方のストローク（ストローク１６００Ａ）の始点と終点に挟まれている範囲内に含まれている場合、その２本のストローク（ストローク１６００Ａ、ストローク１６００Ｂ）は同じ文字を構成していると判定する。
なお、第４の実施の形態におけるストロークの中心とした場合は、第１の実施の形態における閾値Ｔｈ＝０．５（５０％）とした場合と同等であることは、図１６（ｂ）の例から明らかである。
また、第１の実施の形態における閾値Ｔｈ＝０．７（７０％）としたい場合は、第４の実施の形態ではストロークの中心ではなく、ストロークの端点からストローク全体の長さの７０％の位置とすればよい。ここでのストロークの端点とは、重なり合っている範囲における端点である。図１６（ｂ）の例を用いて説明すると、ストローク１６００Ａの右端点から７０％、又はストローク１６００Ｂの左端点から７０％の位置とすればよい。 As one point in the stroke, the center will be described as an example.
As shown in the example of FIG. 16B, if, of the two strokes (stroke 1600A and stroke 1600B), the stroke center 1630 of one stroke (stroke 1600B) lies within the range between the start point and the end point of the other stroke (stroke 1600A), it is determined that the two strokes constitute the same character.
As is clear from the example of FIG. 16B, using the center of the stroke in the fourth embodiment is equivalent to setting the threshold Th = 0.5 (50%) in the first embodiment.
To obtain the equivalent of the threshold Th = 0.7 (70%) in the first embodiment, the fourth embodiment may use, instead of the center of the stroke, the position located at 70% of the total stroke length from an end point of the stroke. The end point here is the end point that lies within the overlapping range. In the example of FIG. 16B, this is the position 70% from the right end point of the stroke 1600A, or 70% from the left end point of the stroke 1600B.
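The point-in-range test of the fourth embodiment can be sketched as follows, with each stroke reduced to its x range. The names `position_at` and `contains` and the example coordinates are hypothetical. The correspondence with the ratio threshold is shown only for the arrangement of FIG. 16B, where the two strokes partially overlap at one end; it is not claimed for other arrangements.

```python
from typing import Tuple

Range = Tuple[float, float]  # (min x, max x) of a stroke; a simplification


def position_at(r: Range, fraction: float, from_left: bool = True) -> float:
    """Point located at the given fraction of the stroke's extent from one end."""
    lo, hi = r
    length = hi - lo
    return lo + fraction * length if from_left else hi - fraction * length


def contains(r: Range, x: float) -> bool:
    """True if x lies within the range r (end points included)."""
    return r[0] <= x <= r[1]


a = (0.0, 10.0)  # plays the role of stroke 1600A
b = (6.0, 12.0)  # plays the role of stroke 1600B; its left end lies under a

# The overlap is 4 of b's 6 units, i.e. about 67% of b.
print(contains(a, position_at(b, 0.5)))  # True: the center of b is inside a
print(contains(a, position_at(b, 0.7)))  # False: the 70% point of b is outside a
```

The two results match what the ratio test gives for these values with Th = 0.5 and Th = 0.7 respectively.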

図１７は、座標値の中心の例を示す説明図である。ストロークの始点ｘ_１、ストロークの終点ｘ_Ｋから式（１）で座標値の中心１７３０を算出する。

ストロークの中心の他に、図１８〜２０の例に示すものを用いてもよい。
図１８は、時間の中心の例を示す説明図である。ストロークを予め定められた時間間隔でサンプリングした位置で中心を決定するものである。ストロークの始点ｘ_１、ストロークの終点ｘ_Ｋから式（２）で時間中心１８３０を算出する。

FIG. 17 is an explanatory diagram illustrating an example of the center of coordinate values. The center 1730 of the coordinate values is calculated by Equation (1) from the stroke start point x_1 and the stroke end point x_K.

In addition to the center of the stroke, the points shown in the examples of FIGS. 18 to 20 may be used.
FIG. 18 is an explanatory diagram showing an example of the center of time. The center is determined from the positions at which the stroke is sampled at predetermined time intervals. The time center 1830 is calculated by Equation (2) from the stroke start point x_1 and the stroke end point x_K.

図１９は、重心の例を示す説明図である。ストロークを予め定められた時間間隔でサンプリングした位置で重心を決定するものである。ストロークの始点ｘ_１、ストロークの終点ｘ_Ｋから式（３）で重心１９３０を算出する。

図２０は、重み付き重心の例を示す説明図である。ここでは、筆圧を考慮した重心とした例を示している。ストロークを予め定められた時間間隔でサンプリングした位置と筆圧で重み付き重心を決定するものである。ストロークの始点ｘ_１、ストロークの終点ｘ_Ｋから式（４）で重み付き重心２０３０を算出する。

FIG. 19 is an explanatory diagram illustrating an example of the center of gravity. The center of gravity is determined from the positions at which the stroke is sampled at predetermined time intervals. The center of gravity 1930 is calculated by Equation (3) from the stroke start point x_1 and the stroke end point x_K.

FIG. 20 is an explanatory diagram illustrating an example of a weighted center of gravity. Here, an example is shown in which the center of gravity takes writing pressure into account. The weighted center of gravity is determined from the positions at which the stroke is sampled at predetermined time intervals and the writing pressure at those positions. The weighted center of gravity 2030 is calculated by Equation (4) from the stroke start point x_1 and the stroke end point x_K.
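Equations (1) to (4) are not reproduced in this text, so the following is only one plausible reading of the four reference points, written under stated assumptions: the samples x_1 ... x_K are given in time order, the coordinate center is the midpoint of the first and last sample, the time center is the sample halfway through the sequence, the center of gravity is the mean of the samples, and the weighted center of gravity is the pressure-weighted mean. All function names are hypothetical, and the actual equations may differ.

```python
from typing import Sequence


def coordinate_center(xs: Sequence[float]) -> float:
    """Assumed reading of FIG. 17: midpoint of the start and end coordinates."""
    return (xs[0] + xs[-1]) / 2.0


def time_center(xs: Sequence[float]) -> float:
    """Assumed reading of FIG. 18: the sample halfway through the time-ordered samples."""
    return xs[(len(xs) - 1) // 2]


def centroid(xs: Sequence[float]) -> float:
    """Assumed reading of FIG. 19: mean of all sampled coordinates."""
    return sum(xs) / len(xs)


def weighted_centroid(xs: Sequence[float], pressures: Sequence[float]) -> float:
    """Assumed reading of FIG. 20: mean of the samples weighted by writing pressure."""
    return sum(p * x for p, x in zip(pressures, xs)) / sum(pressures)


xs = [0.0, 1.0, 2.0, 7.0]         # x coordinates sampled at fixed time intervals
pressures = [1.0, 1.0, 1.0, 3.0]  # writing pressure at each sample

print(coordinate_center(xs))             # 3.5
print(time_center(xs))                   # 1.0
print(centroid(xs))                      # 2.5
print(weighted_centroid(xs, pressures))  # 4.0
```

The example shows how the four points can differ for the same stroke when the pen moves unevenly or presses harder near one end.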

<Fifth Embodiment>
FIG. 21 is a conceptual module configuration diagram of a configuration example of the fifth embodiment. The fifth embodiment includes a pair selection module 2110, an integration determination module 2120, and an integrated information holding module 2130.
The fifth embodiment creates information M holding all determination results between the elements of a stroke sequence S that contains two or more strokes. M is the set of determination results m_ij for stroke s_i and stroke s_j, and its maximum number of elements is the number of two-element combinations of S, that is, C(#{S}, 2).

The pair selection module 2110 is connected to the integration determination module 2120 and the integrated information holding module 2130. The pair selection module 2110 selects, from the set of strokes S, a pair of strokes (two strokes, s_i and s_j) to be processed by the integration determination module 2120. That is, the pair selection module 2110 outputs stroke s_i and stroke s_j from S. After all pairs have been output, it outputs the end signal ⊥.
The integration determination module 2120 is connected to the pair selection module 2110 and the integrated information holding module 2130. The integration determination module 2120 is any one of the first to fourth embodiments.
The integrated information holding module 2130 is connected to the pair selection module 2110 and the integration determination module 2120. The integrated information holding module 2130 stores the determination results of the integration determination module 2120. That is, it holds each received determination result m_ij. When it receives the end signal ⊥, it outputs the all-determination-result information M.
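The flow of the fifth embodiment (select every pair, judge it, and collect the results into M) can be sketched as follows. This is only an illustration under stated assumptions: strokes are simplified to 1-D intervals (x_min, x_max), and `overlap_ratio_rule` is a hypothetical stand-in for whichever of the first to fourth embodiments serves as the determination step; the threshold value and the direction of the comparison are likewise assumed.

```python
from itertools import combinations
from typing import Callable, Dict, Sequence, Tuple

Interval = Tuple[float, float]  # (x_min, x_max) of a stroke; a simplifying assumption


def overlap_ratio_rule(a: Interval, b: Interval, threshold: float = 0.7) -> bool:
    """Hypothetical judgment: overlap length divided by the shorter stroke length."""
    overlap = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    shorter = min(a[1] - a[0], b[1] - b[0])
    return shorter > 0 and overlap / shorter >= threshold


def judge_all_pairs(
    strokes: Sequence[Interval],
    same_character: Callable[[Interval, Interval], bool],
) -> Dict[Tuple[int, int], bool]:
    """Build M: one result m_ij for every unordered pair (i, j) with i < j."""
    results: Dict[Tuple[int, int], bool] = {}
    # combinations() plays the role of the pair selection module; exhausting
    # the iterator corresponds to the end signal.
    for (i, s_i), (j, s_j) in combinations(enumerate(strokes), 2):
        results[(i, j)] = same_character(s_i, s_j)
    return results
```

With n strokes the dictionary holds n(n-1)/2 entries, which matches the maximum number of elements stated for M.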

<Sixth Embodiment>
FIG. 22 is a conceptual module configuration diagram of a configuration example of the sixth embodiment. The sixth embodiment includes a size calculation module 2210, a size order sorting module 2220, a pair selection module 2230, an overlap calculation module 2240, a determination module 2250, and an integrated information holding module 2260. The size of each individual stroke is calculated in advance. As a result, the number of size calculations is smaller than in the fourth embodiment. In addition, the determination processing limits memory usage in the same way as the third embodiment.

The size calculation module 2210 is connected to the size order sorting module 2220. The size calculation module 2210 calculates the lengths of the strokes that can be targets. It calculates the size of every element (stroke) of the stroke group S and creates the all-size information W. Examples of strokes that can be targets include the group of strokes of the handwritten characters in one line, and the group of strokes of the handwritten characters in a document on one page.
The size order sorting module 2220 is connected to the size calculation module 2210 and the pair selection module 2230. The size order sorting module 2220 arranges the strokes in ascending or descending order based on the lengths calculated by the size calculation module 2210. That is, it sorts S and W in ascending or descending order of stroke size. Let S' and W' denote S and W after sorting, respectively. Any sorting algorithm may be used as long as it sorts by comparing lengths.

The pair selection module 2230 is connected to the size order sorting module 2220, the overlap calculation module 2240, the determination module 2250, and the integrated information holding module 2260. The pair selection module 2230 selects a pair of strokes (stroke s_i and stroke s_j) from the stroke group. Any selection method may be used. For example, pairs may be selected at random (including pseudo-randomly), selected so that all combinations are generated as described above, or selected as pairs that are likely to overlap based on their coordinate values (specifically, strokes that are adjacent after sorting by x coordinate). The pair selection module 2230 outputs strokes s_i and s_j from S'. It also outputs from W' either the i-th element w_i or the j-th element w_j. That is, with an ascending sort, i < j, so w_i is output; with a descending sort, i > j, so w_j is output. This processing corresponds to selecting the shorter stroke according to the order produced by the size order sorting module 2220 and outputting the length of the selected stroke. No comparison of stroke lengths is needed at this point.
The overlap calculation module 2240 is connected to the pair selection module 2230 and the determination module 2250. The overlap calculation module 2240 performs processing equivalent to that of the overlap calculation module 120 of the first embodiment.

The determination module 2250 is connected to the pair selection module 2230, the overlap calculation module 2240, and the integrated information holding module 2260. The determination module 2250 determines whether stroke s_i and stroke s_j constitute the same character based on the ratio between the distance calculated by the overlap calculation module 2240 and the length of the stroke selected by the pair selection module 2230 (either the i-th element w_i or the j-th element w_j). Specifically, assuming i < j, stroke s_i is selected when the size order sorting module 2220 has sorted in ascending order.
The integrated information holding module 2260 is connected to the pair selection module 2230 and the determination module 2250. The integrated information holding module 2260 stores the determination results of the determination module 2250.

FIG. 23 is an explanatory diagram illustrating a processing example according to the sixth embodiment. The determination is made when stroke 2300A and stroke 2300B overlap.
With an ascending sort, i < j is assumed, so element w_i is always the denominator of the ratio. Specifically, sorting stroke size W_i 2310 and stroke size W_j 2320 places stroke 2300A before stroke 2300B, and stroke size W_i 2310 is selected for this pair. The ratio is calculated as (overlap interval V_ij 2330) / (stroke size W_i 2310). In other words, there is no need to compare stroke size W_i 2310 with stroke size W_j 2320 to decide which becomes the denominator.
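The point of the sixth embodiment, that sorting once removes the per-pair length comparison, can be sketched as below. The sketch assumes strokes reduced to 1-D intervals (x_min, x_max), an ascending sort, a threshold of 0.7, and a "ratio at or above the threshold means same character" rule; all of these, along with the function name, are illustrative assumptions rather than details fixed by the text.

```python
from typing import Dict, List, Sequence, Tuple

Interval = Tuple[float, float]  # (x_min, x_max) of a stroke; a simplifying assumption


def judge_sorted_pairs(
    strokes: Sequence[Interval], threshold: float = 0.7
) -> Dict[Tuple[int, int], bool]:
    """Judge every pair, using the pre-sorted size list W' to pick the denominator."""
    # Size calculation: each stroke size is computed exactly once (W).
    sizes: List[float] = [b - a for a, b in strokes]
    # Size order sorting: ascending order, giving S' and W'.
    order = sorted(range(len(strokes)), key=sizes.__getitem__)
    s_sorted = [strokes[k] for k in order]
    w_sorted = [sizes[k] for k in order]

    results: Dict[Tuple[int, int], bool] = {}
    for i in range(len(s_sorted)):
        for j in range(i + 1, len(s_sorted)):
            (a1, b1), (a2, b2) = s_sorted[i], s_sorted[j]
            overlap = max(0.0, min(b1, b2) - max(a1, a2))  # V_ij
            # Because i < j in ascending order, w_sorted[i] is the shorter
            # length; no min() or comparison of the two sizes is needed.
            denom = w_sorted[i]
            same = denom > 0 and overlap / denom >= threshold
            # Report the result under the original stroke indices.
            key = tuple(sorted((order[i], order[j])))
            results[key] = same
    return results
```

The zero-length guard is an addition for robustness; the text does not say how a stroke of size zero should be handled.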

<Seventh Embodiment>
FIG. 24 is a conceptual module configuration diagram of a configuration example of the seventh embodiment. The seventh embodiment includes an integrated information creation module 2410, a connected component calculation module 2420, and an integrated stroke holding module 2430, and creates an integrated stroke sequence C.
The integrated information creation module 2410 is connected to the connected component calculation module 2420. The integrated information creation module 2410 is any one of the embodiments described above (the first to sixth embodiments).

The connected component calculation module 2420 is connected to the integrated information creation module 2410 and the integrated stroke holding module 2430. Based on the strokes s_i and s_j that the integrated information creation module 2410 has determined to constitute the same character, the connected component calculation module 2420 extracts the group of strokes constituting that character.
The integrated stroke holding module 2430 is connected to the connected component calculation module 2420. The integrated stroke holding module 2430 holds each input integrated stroke c. When it receives ⊥, it outputs the integrated stroke sequence C.

FIG. 25 is an explanatory diagram illustrating a processing example according to the seventh embodiment.
Based on the all-determination-result information M, the connected component calculation module 2420 integrates the elements of S ((strokes 2512, 2514, 2516) and (strokes 2562, 2564, 2566, 2568)) and creates integrated strokes c_k (integrated stroke information 2530, 2580). To do so, it treats the elements of S as vertices and the determination results in M as edges (determination result information 2522, 2524). Specifically, two stroke vertices are connected by an edge when the strokes have been determined to constitute the same character. Each connected component of the resulting graph (integrated stroke information 2530, 2580) is then treated as one character. After all connected components have been calculated, the end signal ⊥ is output. Existing techniques may be used to calculate the connected components of the graph.
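The graph construction described above can be sketched with a union-find (disjoint-set) structure, which is one common existing technique for connected components; the text does not prescribe a particular algorithm, so this choice is an assumption. The function name and the dictionary layout of M (keys (i, j), boolean values) are also hypothetical.

```python
from typing import Dict, List, Tuple


def connected_components(
    n: int, judgments: Dict[Tuple[int, int], bool]
) -> List[List[int]]:
    """Group stroke indices 0..n-1 into characters from pairwise judgments M."""
    parent = list(range(n))

    def find(a: int) -> int:
        # Follow parent links to the root, halving the path along the way.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    # Every "same character" judgment is an edge that merges two components.
    for (i, j), same in judgments.items():
        if same:
            root_i, root_j = find(i), find(j)
            if root_i != root_j:
                parent[root_j] = root_i

    # Collect the members of each component; each list is one integrated stroke c_k.
    groups: Dict[int, List[int]] = {}
    for k in range(n):
        groups.setdefault(find(k), []).append(k)
    return sorted(groups.values())
```

Note that two strokes end up in the same character even without a direct edge between them, as long as a chain of positive judgments links them.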

<Eighth Embodiment>
FIG. 26 is a conceptual module configuration diagram of a configuration example of the eighth embodiment. The eighth embodiment includes an integrated stroke creation module 2610 and an online character string recognition module 2620, and performs character recognition.
The integrated stroke creation module 2610 is connected to the online character string recognition module 2620 and corresponds to the seventh embodiment.
The online character string recognition module 2620 is connected to the integrated stroke creation module 2610. The online character string recognition module 2620 performs character recognition on the stroke groups extracted by the integrated stroke creation module 2610 and outputs the character recognition result. Specifically, it receives the integrated stroke sequence C and performs online character recognition. Existing character recognition techniques may be used for this.

A hardware configuration example of the information processing apparatus according to the present embodiment will be described with reference to FIG. 27. The configuration shown in FIG. 27 is implemented by, for example, a personal computer (PC), and is a hardware configuration example that includes a data reading unit 2717 such as a scanner and a data output unit 2718 such as a printer.

The CPU (Central Processing Unit) 2701 is a control unit that executes processing in accordance with a computer program describing the execution sequence of the various modules described in the above embodiments, namely the stroke size calculation module 110, the overlap calculation module 120, the determination module 130, the stroke size calculation modules 1012 and 1014, the overlap calculation module 1020, the determination modules 1032 and 1034, the OR operation module 1036, the stroke size calculation modules 1312 and 1314, the overlap calculation module 1320, the minimum operation module 1332, the determination module 1334, the integration determination module 2120, the integrated information holding module 2130, the size calculation module 2210, the size order sorting module 2220, the pair selection module 2230, the overlap calculation module 2240, the determination module 2250, the integrated information holding module 2260, the integrated information creation module 2410, the connected component calculation module 2420, the integrated stroke holding module 2430, the integrated stroke creation module 2610, the online character string recognition module 2620, and so on.

The ROM (Read Only Memory) 2702 stores programs, operation parameters, and the like used by the CPU 2701. The RAM (Random Access Memory) 2703 stores programs used in execution by the CPU 2701, parameters that change as appropriate during that execution, and the like. These components are connected to one another by a host bus 2704, which includes a CPU bus and the like.

The host bus 2704 is connected via a bridge 2705 to an external bus 2706 such as a PCI (Peripheral Component Interconnect/Interface) bus.

The keyboard 2708 and the pointing device 2709, such as a mouse, are input devices operated by an operator. The display 2710 is, for example, a liquid crystal display device or a CRT (Cathode Ray Tube), and displays various types of information as text or image information.

The HDD (Hard Disk Drive) 2711 contains a hard disk, drives the hard disk, and records or reproduces programs executed by the CPU 2701 and other information. The hard disk stores the strokes 100A and 100B accepted by the present embodiment, the determination result 199, the processing results of the integrated stroke holding module 2430, character recognition results, and the like. It also stores various computer programs, such as other data processing programs.

The drive 2712 reads data or programs recorded on a mounted removable recording medium 2713, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, and supplies the data or programs to the RAM 2703, which is connected via the interface 2707, the external bus 2706, the bridge 2705, and the host bus 2704. The removable recording medium 2713 can also be used as a data recording area in the same way as the hard disk.

The connection port 2714 is a port for connecting an external connection device 2715 and has a connection unit such as USB or IEEE 1394. The connection port 2714 is connected to the CPU 2701 and other components via the interface 2707, the external bus 2706, the bridge 2705, the host bus 2704, and the like. The communication unit 2716 is connected to a communication line and executes data communication processing with external devices. The data reading unit 2717 is, for example, a scanner, and executes document reading processing. The data output unit 2718 is, for example, a printer, and executes document data output processing.

Note that the hardware configuration of the information processing apparatus shown in FIG. 27 is only one configuration example. The present embodiment is not limited to the configuration shown in FIG. 27 and may use any configuration capable of executing the modules described in the present embodiment. For example, some modules may be implemented in dedicated hardware (for example, an Application Specific Integrated Circuit (ASIC)), some modules may reside in an external system and be connected via a communication line, and a plurality of the systems shown in FIG. 27 may be connected to one another via communication lines so as to operate in cooperation. The apparatus may also be incorporated in a copying machine, a fax machine, a scanner, a printer, a multifunction machine (an image processing apparatus having two or more of the functions of a scanner, a printer, a copying machine, a fax machine, and the like), or similar equipment.

Further, in the description of the above embodiments, where a comparison with a predetermined value is described as "greater than or equal to", "less than or equal to", "greater than", or "less than", these may instead be read as "greater than", "less than", "greater than or equal to", and "less than or equal to", respectively, as long as the combination causes no contradiction.
Note that the various embodiments described above may be combined (including, for example, adding a module of one embodiment to another embodiment or exchanging modules between embodiments), and the techniques described in the background art may be adopted as the processing content of each module.

The program described above may be provided stored on a recording medium, or may be provided by communication means. In that case, for example, the program described above may be regarded as an invention of a "computer-readable recording medium on which a program is recorded".
The "computer-readable recording medium on which a program is recorded" refers to a computer-readable recording medium on which a program is recorded and which is used for installing, executing, or distributing the program.
Examples of the recording medium include: digital versatile discs (DVD), namely "DVD-R, DVD-RW, DVD-RAM, and the like", which are standards established by the DVD Forum, and "DVD+R, DVD+RW, and the like", which are standards established by DVD+RW; compact discs (CD), namely read-only memory (CD-ROM), CD recordable (CD-R), CD rewritable (CD-RW), and the like; Blu-ray Disc (registered trademark); magneto-optical disks (MO); flexible disks (FD); magnetic tape; hard disks; read-only memory (ROM); electrically erasable programmable read-only memory (EEPROM (registered trademark)); flash memory; random access memory (RAM); and SD (Secure Digital) memory cards.
The program or a part of it may be recorded on the recording medium for storage, distribution, and the like. It may also be transmitted by communication, for example over a wired network used for a local area network (LAN), a metropolitan area network (MAN), a wide area network (WAN), the Internet, an intranet, an extranet, or the like, over a wireless communication network, or over a transmission medium combining these, and it may be carried on a carrier wave.
Furthermore, the program may be a part of another program, or may be recorded on a recording medium together with a separate program. It may also be divided and recorded on a plurality of recording media. It may be recorded in any form, such as compressed or encrypted, as long as it can be restored.

DESCRIPTION OF SYMBOLS
110 ... stroke size calculation module
120 ... overlap calculation module
130 ... determination module
1012 ... stroke size calculation module
1014 ... stroke size calculation module
1020 ... overlap calculation module
1032 ... determination module
1034 ... determination module
1036 ... OR operation module
1312 ... stroke size calculation module
1314 ... stroke size calculation module
1320 ... overlap calculation module
1332 ... minimum operation module
1334 ... determination module
2110 ... pair selection module
2120 ... integration determination module
2130 ... integrated information holding module
2210 ... size calculation module
2220 ... size order sorting module
2230 ... pair selection module
2240 ... overlap calculation module
2250 ... determination module
2260 ... integrated information holding module
2410 ... integrated information creation module
2420 ... connected component calculation module
2430 ... integrated stroke holding module
2610 ... integrated stroke creation module
2620 ... online character string recognition module

Claims

An information processing apparatus comprising:
first calculation means for calculating, based on a first coordinate information sequence and a second coordinate information sequence each composed of coordinate information, the distance of the portion in which a first line segment represented by the first coordinate information sequence and a second line segment represented by the second coordinate information sequence overlap each other;
second calculation means for calculating the length of the first line segment or the second line segment; and
determination means for determining, based on the ratio between the distance calculated by the first calculation means and the length calculated by the second calculation means, whether the first line segment and the second line segment constitute the same character.

The information processing apparatus according to claim 1, wherein
the second calculation means includes:
third calculation means for calculating the length of the first line segment; and
fourth calculation means for calculating the length of the second line segment, and
the determination means includes:
second determination means for provisionally determining, based on the ratio between the distance calculated by the first calculation means and the length calculated by the third calculation means, whether the first line segment and the second line segment constitute the same character;
third determination means for provisionally determining, based on the ratio between the distance calculated by the first calculation means and the length calculated by the fourth calculation means, whether the first line segment and the second line segment constitute the same character; and
fourth determination means for determining, based on the determination result of the second determination means and the determination result of the third determination means, whether the first line segment and the second line segment constitute the same character.

The information processing apparatus according to claim 1, wherein
the second calculation means includes:
third calculation means for calculating the length of the first line segment;
fourth calculation means for calculating the length of the second line segment; and
selection means for selecting one of the length calculated by the third calculation means and the length calculated by the fourth calculation means, and
the determination means determines, based on the ratio between the distance calculated by the first calculation means and the length selected by the selection means, whether the first line segment and the second line segment constitute the same character.

The information processing apparatus according to claim 1, wherein
the first calculation means calculates a predetermined position of the second line segment or the first line segment,
the second calculation means calculates the range of the first line segment or the second line segment, and
the determination means determines whether the first line segment and the second line segment constitute the same character by determining whether the position calculated by the first calculation means is included in the range calculated by the second calculation means.

The information processing apparatus according to claim 1, wherein
the second calculation means calculates the lengths of line segments that can be targets,
the apparatus further comprises alignment means for arranging the line segments in ascending or descending order based on the lengths calculated by the second calculation means, and
the determination means selects the shorter of the first line segment and the second line segment according to the order produced by the alignment means, and determines, based on the ratio between the distance calculated by the first calculation means and the length of the selected line segment, whether the first line segment and the second line segment constitute the same character.

The information processing apparatus according to claim 1, further comprising extraction means for extracting, based on the first line segment and the second line segment determined by the determination means to constitute the same character, the group of line segments constituting that character.

The information processing apparatus according to claim 6, further comprising character recognition means for performing character recognition on the group of line segments extracted by the extraction means.

An information processing program for causing a computer to function as:
first calculation means for calculating, based on a first coordinate information sequence and a second coordinate information sequence each composed of coordinate information, the distance of the portion in which a first line segment represented by the first coordinate information sequence and a second line segment represented by the second coordinate information sequence overlap each other;
second calculation means for calculating the length of the first line segment or the second line segment; and
determination means for determining, based on the ratio between the distance calculated by the first calculation means and the length calculated by the second calculation means, whether the first line segment and the second line segment constitute the same character.