KR100301216B1

KR100301216B1 - Online text recognition device

Info

Publication number: KR100301216B1
Application number: KR1019980005891A
Authority: KR
Inventors: 다이조 가메시로; 다케로리 가와마타
Original assignee: 다니구찌 이찌로오, 기타오카 다카시; 미쓰비시덴키 가부시키가이샤
Priority date: 1997-03-04
Filing date: 1998-02-25
Publication date: 2001-11-30
Also published as: TW399187B; CN1201207A; JPH10247221A; KR19980079762A; JP3657077B2; CN1096043C

Abstract

Provided is an online character recognition device that prevents the drop in recognition rate caused by unexpected components in an input character pattern and improves the recognition rate for cursive characters and the like.

The device comprises: a feature extraction unit (2) that extracts the direction, length, and feature points of each segment obtained by polyline approximation of the coordinate points on an input character pattern received from an input unit (1); a feature point correspondence unit (3) that associates the segments of each dictionary character with the segments obtained from the input character pattern and computes a segment correspondence distance; a designated-section feature extraction unit (4) that extracts, as a corresponding stroke feature, the feature information of the section delimited by the feature point pair (特徵点組) on the input character pattern that corresponds to a feature point pair of the dictionary character; and a feature matching unit (5) that determines candidate characters by using the computed corresponding-stroke-feature distance together with the segment correspondence distance.

Description

Online character recognition device

The present invention relates to an online character recognition device for entering handwritten characters into a pen computer or the like, and in particular to an online character recognition device that improves the recognition rate for cursive characters and the like.

In online character recognition, a core technology for entering character codes into a pen computer whose input means are a pen and a tablet, characters written in block style (楷書) can be recognized with high accuracy by the well-known basic stroke method (several stroke shapes are defined in advance as basic strokes, and a character is represented as a combination of basic strokes) and by various other recognition methods. Recognition performance for cursive characters, however, is not adequate compared with block style. Online recognition methods for cursive characters have therefore long been studied. One example is "Online Character Recognition Independent of Stroke Count and Stroke Order by Selective Stroke Linkage", Transactions of the Institute of Electronics and Communication Engineers, J66-D No. 5, pp. 593-600, referred to below as Conventional Example 1.

In Conventional Example 1, the strokes (a stroke is the unit of the coordinate sequence from pen-down to pen-up) of whichever of the input pattern and the dictionary has fewer strokes are associated one-to-one with strokes of the one that has more. Strokes on the larger side that remain unassociated are selectively linked to strokes that are already associated. The distance between the coordinate points of the dictionary and of the input pattern after linking is then computed by DP (dynamic programming) matching, and candidate characters are output, which makes cursive characters recognizable. DP matching is described, for example, from page 62 of "Pattern Recognition" (Noboru Funakubo, Kyoritsu Shuppan), so it is not detailed here. Conventional Example 1 uses coordinate points as the DP matching feature, but another method uses the direction components (direction codes) between coordinate points obtained by dividing the strokes into equal parts along the pen movement, as shown in Fig. 18.
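The description defers the details of DP matching to the cited textbook. As a rough illustration only, the sketch below aligns two coordinate sequences with a generic DTW-style recurrence and returns the accumulated point-to-point distance. The recurrence, the Euclidean point distance, and the name `dp_matching_distance` are assumptions made for this example and are not taken from Conventional Example 1.

```python
import math

def dp_matching_distance(input_points, dict_points):
    """Accumulated distance along the cheapest monotonic alignment.

    Assumption: a plain DTW-style recurrence; the cited paper may use
    different step patterns or normalization.
    """
    n, m = len(input_points), len(dict_points)
    inf = float("inf")
    # cost[i][j] = best accumulated distance aligning the first i input
    # points with the first j dictionary points.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(input_points[i - 1], dict_points[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # input point repeated
                                 cost[i][j - 1],      # dictionary point repeated
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]
```

Because the feature here is the raw coordinate, a uniform shift of the input raises every term of the sum, which is the weakness the description attributes to coordinate-based matching.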

In Fig. 18, all strokes of the input pattern are joined along the pen-movement direction, as if the character were written in a single stroke, and the strokes are approximated by division into equal parts of a suitable width. The direction component between successive division points (a1, a2, a3, a4, ..., a21) in Fig. 18 is approximated with, for example, the 8-direction code shown in Fig. 19, and these direction components can be used as the DP matching feature to recognize cursive characters.
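Quantizing the direction between two division points into an 8-direction code can be sketched as follows. Fig. 19 is not reproduced in the text, so the numbering is an assumption: codes 1 to 8 run counter-clockwise from "right" with the y axis pointing up. This is consistent with the codes quoted for 「石」 in this description (horizontal stroke 1, vertical stroke 7, left-falling stroke 6), but the function names and the axis convention are hypothetical.

```python
import math

def direction_code_8(p, q):
    """Quantize the direction from p to q into a code 1..8.

    Assumed numbering: 1 = right, then counter-clockwise in 45-degree
    steps, with y increasing upward. Negate the y difference for tablets
    whose y axis points down.
    """
    angle = math.atan2(q[1] - p[1], q[0] - p[0])  # radians in (-pi, pi]
    sector = round(angle / (math.pi / 4)) % 8     # nearest 45-degree sector
    return sector + 1

def direction_codes(points):
    """Direction code between each pair of consecutive division points."""
    return [direction_code_8(p, q) for p, q in zip(points, points[1:])]
```

The resulting code sequence, rather than the coordinates themselves, would then serve as the DP matching feature.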

For methods based on basic strokes, there are approaches that build and match against a dictionary covering cursive characters, and approaches that keep stroke separation information in the dictionary and decompose strokes so that the input pattern and the dictionary have the same stroke count.

One conventional example using basic strokes is Japanese Patent Laid-Open No. Hei 2-10473, referred to below as Conventional Example 2. Its configuration and operation are described here.

Fig. 20 is a block diagram showing the basic configuration of the online character recognition device of Conventional Example 2. It shows a coordinate input device (21); a basic line segment identification circuit (22) that receives the output of the coordinate input device (21); a line segment code sending circuit (23) that receives the output of the basic line segment identification circuit (22) and outputs it sequentially; a line segment code buffer (24); a line segment decomposition circuit (25) that decomposes the line segments of a stroke one after another when the decision circuit (29) calls for re-recognition; a control circuit (26); a comparison circuit (27) that compares the output of the line segment code sending circuit (23) with the output of the dictionary storage unit (28); the decision circuit (29), which receives the output of the comparison circuit (27) and decides the character; and the dictionary storage unit (28), which sends the stored dictionary data sequentially to the comparison circuit (27).

The basic line segment identification circuit (22), supplied with the time-series coordinate points output by the coordinate input device (21), approximates each stroke by a polyline and expresses the direction of each polyline piece (segment) with the 8-direction code shown in Fig. 19. It then determines which basic stroke each input stroke belongs to, using the correspondence table between direction code strings and basic strokes shown in Fig. 21.

The operation of Conventional Example 2 is described next using the pattern of Fig. 22, in which the strokes are written in the order of lines 101, 102, 103, 104, 105. Approximating each stroke by a polyline and expressing it in 8-direction codes in stroke order gives {(1), (6), (7), (1, 7), (1)}. Applying the basic stroke table of Fig. 21 to this yields the basic stroke sequence {(1), (3), (4), (7), (1)}.
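The table lookup in this worked example can be sketched as a dictionary from direction code strings to basic stroke numbers. Only the four entries implied by the example are included; the full table of Fig. 21 is not given in the text, and the names `BASIC_STROKE_TABLE` and `to_basic_strokes` are hypothetical.

```python
# Partial table: only the entries implied by the worked example.
# The complete correspondence table (Fig. 21) is not reproduced in the text.
BASIC_STROKE_TABLE = {
    (1,): 1,     # single rightward segment
    (6,): 3,     # single down-left segment
    (7,): 4,     # single downward segment
    (1, 7): 7,   # rightward segment followed by a downward segment
}

def to_basic_strokes(direction_code_strokes):
    """Map each stroke's direction code string to its basic stroke number."""
    return [BASIC_STROKE_TABLE[tuple(codes)] for codes in direction_code_strokes]
```

For the five strokes of the example, `to_basic_strokes([(1,), (6,), (7,), (1, 7), (1,)])` gives `[1, 3, 4, 7, 1]`, the sequence stated above.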

Because the character pattern of Fig. 22 is written in five strokes, the comparison circuit (27) matches it against the five-stroke dictionary and the decision circuit (29) decides the candidate character. The pattern matches the dictionary character 「石」, and its character code is output.

The operation for the cursive pattern of Fig. 23 is described next. In Fig. 23 the strokes are written in the order of lines 106, 107, 108, 109. For this pattern the basic line segment identification circuit (22) likewise obtains the basic stroke sequence {(1), (3), (4), (21)}. Since the stroke count is 4, the comparison circuit (27) compares against the four-stroke dictionary. The decision circuit (29) cannot output a character unless a four-stroke entry for 「石」 exists in the dictionary storage unit (28). Control therefore returns to the control circuit (26), and the line segment decomposition circuit (25) decomposes strokes one after another. The strokes to decompose and the decomposition rules are registered in advance in the dictionary storage unit (28); here the rule shown in Fig. 24 splits basic stroke (21) into (7) and (1), bringing the stroke count to 5. The basic stroke sequence of the input pattern is thus corrected to {(1), (3), (4), (7), (1)}, and the comparison circuit (27) matches it against the five-stroke dictionary in the dictionary storage unit (28). The pattern matches the dictionary character 「石」, and the decision circuit (29) outputs the result.
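The match, decompose, and re-match loop of Conventional Example 2 can be sketched as below. The only decomposition rule given in the text is 21 → (7), (1), and the only dictionary entry is the five-stroke sequence for 「石」. The data layout and the function names are assumptions made for illustration, and the sketch presumes acyclic rules so that the loop terminates.

```python
# Only rule stated in the text (Fig. 24 is not reproduced): 21 -> (7), (1).
DECOMPOSITION_RULES = {21: (7, 1)}

# Hypothetical dictionary layout: character -> basic stroke sequence.
STROKE_DICTIONARY = {"石": (1, 3, 4, 7, 1)}

def lookup(strokes):
    """Return the dictionary characters whose stroke sequence matches exactly."""
    return [ch for ch, seq in STROKE_DICTIONARY.items() if seq == tuple(strokes)]

def recognize_with_decomposition(strokes):
    """Match; on failure, decompose one joined stroke and try again."""
    strokes = list(strokes)
    while True:
        candidates = lookup(strokes)
        if candidates:
            return candidates
        for i, s in enumerate(strokes):
            if s in DECOMPOSITION_RULES:
                strokes[i:i + 1] = DECOMPOSITION_RULES[s]  # split in place
                break
        else:
            return []  # no applicable rule: recognition fails
```

The empty result in the last branch corresponds to the failure case the description criticizes: a joined stroke with no registered rule cannot be recognized.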

However, although Conventional Example 1 can recognize cursive characters, when a dictionary pattern (Fig. 25(a)) is matched with an input pattern (Fig. 25(b)), positional shift or deformation increases the distance between corresponding coordinate points, as shown in Fig. 25(c). The distance to the dictionary entry grows as a result, and misrecognition becomes likely.

In addition, when direction codes are used as the DP matching feature as shown in Fig. 18, cursive characters can be recognized and the method becomes robust against positional shifts such as that in Fig. 25(c), but characters with similar pen-movement directions, for example 「伎」 and 「」, 「却」 and 「劫」, or 「村」 and 「杖」, are easily confused.

Furthermore, when recognizing a character whose shape (字形) is distorted so that its direction differs greatly from the corresponding part of the dictionary, the cost obtained by DP matching between the input pattern and the dictionary is larger than for a neatly written pattern, and the character is easily misread as a different one.

Moreover, as shown for example in Figs. 26(a) and 26(b), the matching distance between a character that has "bent portion" or "pressed portion" components (Fig. 26(a)) and a dictionary entry that lacks them, that is, has no corresponding components (Fig. 26(b)), becomes large and misrecognition is likely. One conceivable countermeasure is to ignore, or to down-weight, the direction codes of fold-back components near the start and end points of a stroke (for example, a start-point or end-point line segment whose angle to the adjoining straight part differs by 90 degrees or less). However, whether a fold-back component near a start or end point is noise or a feature essential to the character cannot be determined until the character has been recognized. Components near the start and end points of a stroke therefore cannot simply be ignored.

Methods that recognize using stroke features such as basic strokes, on the other hand, cannot compute a distance unless the stroke counts of the dictionary and the input pattern agree. To handle cursive characters, cursive patterns must be registered in the dictionary in advance, or the parts of each dictionary pattern that tend to be joined must be described character by character. In Conventional Example 2, if no decomposition rule for a joined stroke of the input pattern exists in the decomposition dictionary, the stroke cannot be decomposed and the character is misrecognized. Covering the many possible joined strokes of every character would require an enormous dictionary.

Even if one tries to solve this without a decomposition dictionary, for example by splitting a stroke that has several direction codes at each direction code, a joined stroke may contain not only real strokes, which appear as strokes in the original character, but also virtual strokes, which do not appear in block style. Figs. 27(a) and 27(b) show an example. Compared with the dictionary of Fig. 27(b), the input pattern of Fig. 27(a) has extra components (30 and 31) that would be virtual strokes at the correct stroke count. A direction code string obtained merely by splitting into straight components therefore does not necessarily match a dictionary that holds only real-stroke features, so misrecognition can occur, and strokes cannot simply be split by direction code.

The present invention was made to solve the problems described above. Its object is to provide an online character recognition device that prevents the drop in recognition rate caused by unexpected components in an input character pattern and improves the recognition rate for cursive characters (characters written with their strokes joined one after another) and the like.

Fig. 1 is a block diagram showing Embodiment 1 of the online character recognition device according to the present invention.

Fig. 2 shows an example of the dictionary information for the character 「家」 used in Embodiment 1.

Fig. 3 shows an example of the dictionary information for the character 「琢」 used in Embodiment 1.

Fig. 4 is a flowchart of the character recognition process in Embodiment 1.

Fig. 5 is a flowchart of the processing performed by the feature extraction unit within the character recognition process of Embodiment 1.

Fig. 6(a) shows an example of the 16-direction code used in Embodiment 1.

Fig. 6(b) shows a table of the values used for DP matching.

Fig. 7 shows the input pattern after feature extraction in Embodiment 1.

Fig. 8 shows, in tabular form, the features extracted from the input pattern by feature extraction in Embodiment 1.

Fig. 9 shows the coordinate points of the input pattern corresponding to the start and end points of the dictionary character 「家」.

Fig. 10 shows the corresponding stroke features of the input pattern for the dictionary character 「家」.

Fig. 11 shows the coordinate points of the input pattern corresponding to the start and end points of the dictionary character 「琢」.

Fig. 12 shows the corresponding stroke features of the input pattern for the dictionary character 「琢」.

Figs. 13(a) to 13(d) show examples of portions that vary greatly from one character pattern to another.

Fig. 14 shows an example of the segment dictionary contents for the dictionary character 「木」.

Fig. 15 shows the segment features of the character 「木」 shown in Fig. 13(d).

Fig. 16 shows a segment dictionary for the character 「木」 that uses direction-independent codes.

Figs. 17(a) and 17(b) are used to explain the limit on the number of correspondences with a direction-independent code in Embodiment 2.

Fig. 18 shows the features of the recognition method using direction codes in Conventional Example 1.

Fig. 19 shows an example of the 8-direction code.

Fig. 20 is a block diagram showing the basic configuration of the online character recognition device of Conventional Example 2.

Fig. 21 shows the correspondence table between direction code strings and stroke codes.

Fig. 22 shows the input pattern of the character 「石」 used to explain the operation of Conventional Example 2.

Fig. 23 shows the input pattern of the character 「石」 used to explain the operation of Conventional Example 2.

Fig. 24 shows the decomposition rules for cursive strokes.

Figs. 25(a) to 25(c) show the effect of positional shift on the distance calculation in Conventional Example 1.

Figs. 26(a) and 26(b) show an example of an input pattern that includes a "bent portion" and a "pressed portion" and a dictionary entry that does not.

Figs. 27(a) and 27(b) show an example in which virtual strokes appear because of cursive writing.

* Explanation of reference numerals for the main parts of the drawings

1: input unit
2: feature extraction unit
3: feature point correspondence unit
4: designated-section feature extraction unit
5: feature matching unit
6: dictionary storage unit
7: output unit

To achieve the above object, the online character recognition device according to the first invention takes coordinate point sequence data of a character pattern as input and outputs the character code corresponding to that input character pattern, and comprises: input means for inputting the coordinate point sequence data on the strokes produced when the input character pattern is written; feature extraction means that takes as segments the straight portions obtained by polyline approximation of the time-ordered coordinate points contained in the coordinate point sequence data input by the input means, and extracts feature information on each segment and the feature points that are the end points of each segment; dictionary storage means that stores in advance a dictionary holding, for each character, the feature information and feature points of the segments that make up the character; feature point correspondence means that, based on the feature information of each character described in the dictionary and the feature information extracted by the feature extraction means, associates the segments making up each character in the dictionary with the segments obtained from the input character pattern and computes a segment correspondence distance; designated-section feature extraction means that extracts, as a corresponding stroke feature, the feature information of the section delimited by the feature point pair on the strokes of the input pattern that corresponds to a feature point pair on a stroke designated by the dictionary; feature matching means that matches the corresponding stroke feature extracted by the designated-section feature extraction means against the feature information in the dictionary and computes a corresponding-stroke-feature distance; and output means that outputs the candidate character code obtained by the feature matching means. The feature matching means identifies the dictionary character corresponding to the input character pattern on the basis of the computed corresponding-stroke-feature distance and the segment correspondence distance computed by the feature point correspondence means.

The online character recognition device according to the second invention is the device of the first invention in which the feature point correspondence means associates one coordinate point of the input character pattern with each of the start point and the end point of every stroke of the dictionary character, and the designated-section feature extraction means takes as the feature point pair the feature point of the input character pattern corresponding to the start point and the feature point of the input character pattern corresponding to the end point.

The online character recognition device according to the third invention is the device of the first invention in which the information on each segment includes the segment's direction and length, and the feature point correspondence means computes the cost of each associated segment from the direction and length of each segment computed by the feature extraction means, and computes the segment correspondence distance from those costs.

Preferred embodiments of the present invention are described next with reference to the drawings.

[Embodiment 1]

Fig. 1 is a block diagram showing Embodiment 1 of the online character recognition device according to the present invention. The device of this embodiment consists of an input unit (1), a feature extraction unit (2), a feature point correspondence unit (3), a designated-section feature extraction unit (4), a feature matching unit (5), a dictionary storage unit (6), and an output unit (7). The input unit (1) serves as the input means and inputs the coordinate point sequence data on the strokes produced when the user writes character data (the input character pattern) with a pen on a tablet or the like. The feature extraction unit (2) serves as the feature extraction means: it takes as segments the straight portions obtained by polyline approximation of the time-ordered coordinate points contained in the coordinate point sequence data input to the input unit (1), and extracts feature information on each segment and the feature points that are the end points of each segment. The feature point correspondence unit (3) serves as the feature point correspondence means: based on the feature information of each character described in the dictionary and the feature information extracted by the feature extraction unit (2), it associates the segments making up each character in the dictionary with the segments obtained from the input character pattern and computes the segment correspondence distance. The designated-section feature extraction unit (4) serves as the designated-section feature extraction means: it extracts, as the corresponding stroke feature, the feature information of the section delimited by the feature point pair on the strokes of the input character pattern that corresponds to a feature point pair on a stroke designated by the dictionary. The feature matching unit (5) serves as the feature matching means and computes the distance of the corresponding stroke feature extracted by the designated-section feature extraction unit (4). The dictionary storage unit (6) serves as the dictionary storage means and stores the dictionary in advance. In this embodiment the dictionary holds, for each character, the feature information and feature points of the segments that make up the character. The output unit (7) serves as the output means and outputs the candidate character code obtained by the feature matching unit (5).

Fig. 2 shows, in tabular form, the dictionary information for the character 「家」, and Fig. 3 shows, in tabular form, the dictionary information for the character 「琢」. The dictionary stored in the dictionary storage unit (6) holds the character code, the direction code and segment length as segment feature information, and the width and height of each stroke's circumscribed rectangle. Segment direction codes and segment lengths are held for virtual strokes as well as for strokes. A stroke is the unit of the coordinate sequence from pen-down to pen-up; here such a stroke is called a real stroke, and a stroke connecting the end point (pen-up position) of one stroke to the start point (pen-down position) of the next stroke is called a virtual stroke. A real stroke may hold several segments, whereas a virtual stroke always consists of one segment. For a stroke with several segments, the direction from its start point to its end point is shown in parentheses in Figs. 2 and 3. Although not shown in Figs. 2 and 3, a stroke identification code that identifies whether each segment belongs to a real stroke or a virtual stroke is also held.
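One way to picture the dictionary contents listed above is the following data-structure sketch. The class and field names are hypothetical, the tables of Figs. 2 and 3 are not reproduced in the text, and any values used with these classes are illustrative rather than taken from the figures.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Segment:
    direction_code: int        # quantized direction of the segment
    length: int                # segment length
    is_virtual: bool = False   # stroke identification code (real or virtual)

@dataclass
class Stroke:
    segments: List[Segment]
    bbox_width: int            # width of the circumscribed rectangle
    bbox_height: int           # height of the circumscribed rectangle
    # Start-to-end direction, kept for strokes with several segments.
    overall_direction: Optional[int] = None

    def __post_init__(self):
        # Per the description, a virtual stroke consists of exactly one segment.
        if any(s.is_virtual for s in self.segments) and len(self.segments) != 1:
            raise ValueError("a virtual stroke must consist of exactly one segment")

@dataclass
class DictionaryEntry:
    char_code: str
    strokes: List[Stroke] = field(default_factory=list)

    def segments(self):
        """All segments in writing order, real and virtual interleaved."""
        return [seg for stroke in self.strokes for seg in stroke.segments]
```

Storing virtual strokes alongside real ones lets a dictionary entry be flattened into a single segment sequence, which is the form a segment-level DP matching would consume.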

FIG. 4 is a flowchart of the character recognition processing in the present embodiment, and FIG. 5 is a flowchart showing the processing of the feature extraction unit 2. FIG. 6(a) shows an example of the 16-direction code, and FIG. 6(b) shows a table of the values used for DP matching. In the present embodiment, the matching between the segments constituting each character in the dictionary and the segments obtained from the input character pattern is performed by DP matching.

Next, the flow of the recognition processing in the present embodiment is described with reference to the flowchart of FIG. 4.

First, the input unit 1 obtains the time-ordered coordinate sequence of the handwritten character data that the user writes on the tablet with a pen (step 100). Next, the feature extraction unit 2 performs preprocessing and feature extraction (step 101). The details of this processing are described using the flowchart shown in FIG. 5.

For the input coordinate sequence, the feature extraction unit 2 compares the distance between successive coordinate points with a reference width and performs extraction processing on the points whose distance does not exceed the reference width (step 201). In the present embodiment, each straight-line portion obtained by polyline approximation of the coordinate points remaining after this extraction, taken in time-series order, is called a segment. Next, the direction code of each segment is extracted (step 202). Segment direction codes are extracted for virtual strokes as well as for real strokes. Here the direction code sequence is extracted using the 16-direction code shown in FIG. 6(a). Then the segments are joined (step 203). When the directions of adjacent segments are close, specifically when the direction difference between adjacent segments is ±1, those segments are joined and the direction code of the joined segment is recalculated. For example, when a segment with direction code 8 is followed by a segment with direction code 9, the two are joined into a single segment with direction code 8. The joining process is not applied to virtual strokes. The segment lengths after joining are then calculated (step 204). A segment length is expressed as a multiple of the reference width used in the extraction processing. In the present embodiment, the direction code, which indicates the direction of a segment, and the segment length are extracted as the feature information. FIG. 7 shows the input pattern after the above feature extraction processing, and FIG. 8 shows the extracted features in tabular form.
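The joining step (step 203) can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the `Segment` type and `merge_segments` function are hypothetical names, the direction difference is assumed to be circular over the 16 codes, adjacent segments with identical codes are also joined, and the joined segment simply keeps the first code and the summed length (mirroring the 8, 9 → 8 example) instead of being recalculated from coordinates.

```python
from dataclasses import dataclass

NUM_DIRECTIONS = 16  # the text uses a 16-direction code (FIG. 6(a))


@dataclass
class Segment:
    code: int              # direction code, assumed to run from 1 to 16
    length: float          # length in multiples of the reference width
    virtual: bool = False  # True for a virtual (pen-up) stroke segment


def direction_diff(a: int, b: int) -> int:
    """Circular difference between two direction codes (0 to 8)."""
    d = abs(a - b) % NUM_DIRECTIONS
    return min(d, NUM_DIRECTIONS - d)


def merge_segments(segments):
    """Join adjacent real segments whose direction codes differ by at most 1."""
    merged = []
    for seg in segments:
        prev = merged[-1] if merged else None
        if (prev is not None and not prev.virtual and not seg.virtual
                and direction_diff(prev.code, seg.code) <= 1):
            # Assumed simplification: keep the first code, add the lengths.
            merged[-1] = Segment(prev.code, prev.length + seg.length)
        else:
            # Virtual strokes are never joined, as stated in the text.
            merged.append(Segment(seg.code, seg.length, seg.virtual))
    return merged
```

Because a virtual stroke always separates two real strokes, real segments of different strokes are never joined in this sketch.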

Returning to FIG. 4, the feature point correspondence unit 3 then retrieves the data of one character from the dictionary (step 102). In this example, the dictionary entry for 「家」 shown in FIG. 2 is retrieved. Next, the feature point correspondence unit 3 matches the segments of the input pattern with those of the dictionary character 「家」 by DP matching (step 103). DP matching is performed as follows.

Let the segments of the input pattern be Si = {si(1), si(2), ..., si(i), ..., si(I)} and the segments of the dictionary be Sd = {sd(1), sd(2), ..., sd(j), ..., sd(J)}. Then the following is executed:

[Equation 1]

This expression is hereinafter referred to as Equation 1. The function min returns the minimum value.

Here, the following Equation 2 is used.

[Equation 2]

D[si(i+1)][sd(j+1)] = a[si(i+1), sd(j+1)] * (|si(i+1)| + |sd(j+1)|)

This expression is hereinafter referred to as Equation 2. In Equation 1, d[i+1][j+1] denotes the accumulated matching cost from the start point up to si(i+1) and sd(j+1). In Equation 2, D[si(i+1)][sd(j+1)] denotes the cost of matching segment si(i+1) with segment sd(j+1). a[si(i+1), sd(j+1)] is a value determined by the direction difference between segment si(i+1) and segment sd(j+1); here the values of the table shown in FIG. 6(b) are used. |si(i+1)| and |sd(j+1)| are the segment lengths of segments si(i+1) and sd(j+1). Although not shown here, a matching path table that records which choice gives the minimum value is also held.

Equation 1 is evaluated step by step, and finally

[Equation 3]

dist_dp = d[I][J] / (I + J)

is calculated. This expression is hereinafter referred to as Equation 3. The value of Equation 3 is taken as the cost of DP matching against the dictionary (the segment correspondence distance). Here dist in Equation 3 stands for distance, the function that gives the DP matching cost. This DP matching uses the method described in the above-mentioned "Pattern Recognition" (Noboru Funakubo, Kyoritsu Shuppan).
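Equations 1 to 3 can be sketched as follows, under stated assumptions. The body of Equation 1 does not appear in this text, so the sketch assumes the common three-predecessor DP recurrence d[i][j] = min(d[i-1][j], d[i][j-1], d[i-1][j-1]) + D[i][j]. The table of FIG. 6(b) is likewise not reproduced, so `a_table` is passed in as a parameter, and the example uses only the two values that can be read off the worked examples elsewhere in the description (difference 0 gives 0, difference 4 gives 20). The function names and the (code, length) tuple layout are hypothetical.

```python
def direction_diff(a, b, n=16):
    """Circular difference between two direction codes of an n-direction code."""
    d = abs(a - b) % n
    return min(d, n - d)


def local_cost(si, sd, a_table):
    """Equation 2: a[si, sd] * (|si| + |sd|); a segment is a (code, length) pair."""
    return a_table[direction_diff(si[0], sd[0])] * (si[1] + sd[1])


def dp_distance(inp, dic, a_table):
    """Equation 3: dist_dp = d[I][J] / (I + J), using the ASSUMED recurrence."""
    I, J = len(inp), len(dic)
    inf = float("inf")
    d = [[inf] * (J + 1) for _ in range(I + 1)]
    d[0][0] = 0.0
    for i in range(1, I + 1):
        for j in range(1, J + 1):
            best_prev = min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
            d[i][j] = best_prev + local_cost(inp[i - 1], dic[j - 1], a_table)
    return d[I][J] / (I + J)


# Only these two entries of FIG. 6(b) can be inferred from the text.
A_TABLE = {0: 0, 4: 20}

# Toy data: two input segments matched against one dictionary segment.
toy_input = [(9, 7), (13, 1)]
toy_dict = [(9, 7)]
toy_dist = dp_distance(toy_input, toy_dict, A_TABLE)  # 160 / 3
```

A fuller version would also record, for each cell, which predecessor gave the minimum, which corresponds to the matching path table mentioned in the text.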

A connected character (one written with its strokes run together) has fewer strokes than the same character written with the correct stroke count. Consequently, when the strokes and segments of a connected input pattern are matched against a character dictionary that has the correct stroke count, a plurality of dictionary components may correspond to one component of the input pattern. However, a real stroke or real segment of the dictionary does not normally become a virtual stroke in the input pattern. It is therefore necessary to prevent a segment that corresponds to a stroke actually drawn as part of a dictionary character from being matched with a segment that corresponds to an undrawn portion of the input character pattern. To this end, when the DP matching of segments evaluates a virtual stroke of the input pattern against a real stroke component of the dictionary, a large penalty distance is given to the value of D[si(i)][sd(j)] so that these segments are not matched, which blocks correspondences that cannot actually occur. This more reliably prevents misrecognition even for connected characters. Other methods may of course be used to keep such segments from being matched.
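The penalty rule can be illustrated with a small variation of the Equation 2 cost. This is a sketch only: the text does not give the size of the penalty, so `PENALTY` is an arbitrary hypothetical value, the (code, length, is_virtual) tuple layout is an assumption, and `a_table` again holds only the two FIG. 6(b) values inferable from the text.

```python
PENALTY = 10_000  # hypothetical "large penalty distance"; not given in the text


def direction_diff(a, b, n=16):
    """Circular difference between two direction codes of an n-direction code."""
    d = abs(a - b) % n
    return min(d, n - d)


def local_cost_with_penalty(si, sd, a_table):
    """Equation 2 cost with the penalty; a segment is (code, length, is_virtual)."""
    input_is_virtual = si[2]
    dict_is_real = not sd[2]
    if input_is_virtual and dict_is_real:
        # A drawn dictionary stroke cannot turn into a pen-up movement.
        return PENALTY
    return a_table[direction_diff(si[0], sd[0])] * (si[1] + sd[1])


A_TABLE = {0: 0, 4: 20}  # only the entries inferable from the text
```

The opposite pairing (a real input segment against a virtual dictionary stroke) is left unpenalized, since that is exactly what happens when a writer joins two strokes.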

When the correspondence between the dictionary features of 「家」 shown in FIG. 2 and the features of the input pattern shown in FIG. 8 is calculated using Equations 1 to 3 and FIGS. 6(a) and 6(b), dist_dp = 682 is obtained.

Next, in FIG. 4, the feature point correspondence unit 3 uses the path table (not shown) obtained in step 103 to obtain the coordinate points of the input pattern that correspond to the start point and end point of each dictionary stroke (step 104). FIG. 9 shows the coordinate points of the input pattern that correspond to the start and end points of the strokes of the dictionary character 「家」. The designated section feature extraction unit 4 then extracts features between the corresponding points of the input pattern (step 105). Here, the pair of points corresponding to the start point and end point of each stroke of 「家」 shown in FIG. 9 is taken as a feature point pair, and the feature point (start point) of the input character pattern corresponding to that start point together with the feature point (end point) of the input character pattern corresponding to that end point is taken as the feature point pair of the input character pattern. From the coordinate point sequence lying between each feature point pair of the input character pattern, the width and height of the circumscribed rectangle between the start and end points and the direction from the start point to the end point are obtained. In addition, the direction and length of the virtual stroke between feature point pairs (the vector from the point corresponding to an end point to the point corresponding to the start point of the next stroke) are obtained. These features are hereinafter called corresponding stroke features. The result is shown in FIG. 10.
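The extraction of corresponding stroke features in step 105 can be sketched as below. The function names, the dictionary keys and the index-based interface are hypothetical. Directions are returned as raw (dx, dy) vectors, because quantizing them into the 16-direction code depends on the layout of FIG. 6(a), which is not reproduced in this text.

```python
import math


def stroke_features(points, start, end):
    """Features of the input section points[start..end] (inclusive indices)."""
    section = points[start:end + 1]
    xs = [p[0] for p in section]
    ys = [p[1] for p in section]
    return {
        "width": max(xs) - min(xs),    # circumscribed rectangle width
        "height": max(ys) - min(ys),   # circumscribed rectangle height
        "direction_vector": (points[end][0] - points[start][0],
                             points[end][1] - points[start][1]),
    }


def virtual_stroke(points, end_of_stroke, start_of_next):
    """Vector from the point matched to a stroke end to the next stroke start."""
    dx = points[start_of_next][0] - points[end_of_stroke][0]
    dy = points[start_of_next][1] - points[end_of_stroke][1]
    return {"direction_vector": (dx, dy), "length": math.hypot(dx, dy)}


# Toy coordinates, not taken from the figures.
toy_points = [(0, 0), (2, 1), (4, 0), (7, 4), (7, 9)]
first = stroke_features(toy_points, 0, 2)
gap = virtual_stroke(toy_points, 2, 3)
```

The start and end indices would come from the DP matching path table, which maps each dictionary stroke's end points onto input coordinate points.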

Next, the feature combination unit 5 collates the stroke features of the dictionary with the corresponding stroke features of the input pattern (step 106). The collation of corresponding stroke features is calculated, for example, as (difference in circumscribed rectangle width) + (difference in circumscribed rectangle height) + (difference in direction from start point to end point) + (direction difference of the virtual stroke) + (length difference of the virtual stroke). When the virtual stroke corresponding to the dictionary does not exist in the input pattern, that part is not calculated. Applying this calculation to the features of FIG. 2 and FIG. 10 gives the corresponding stroke feature distance to the dictionary entry 「家」, dist_st = 93.
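The example sum given for step 106 might be written as follows. This is a sketch under assumptions: the dictionary keys and function names are hypothetical, directions are assumed to be 16-direction codes compared circularly, absolute differences are used for each term, and no scaling between terms is applied because the text does not describe any.

```python
def direction_diff(a, b, n=16):
    """Circular difference between two direction codes of an n-direction code."""
    d = abs(a - b) % n
    return min(d, n - d)


def stroke_feature_distance(dict_strokes, input_strokes):
    """Sum the five example terms over all strokes (dist_st in the text)."""
    total = 0
    for ds, ins in zip(dict_strokes, input_strokes):
        total += abs(ds["width"] - ins["width"])
        total += abs(ds["height"] - ins["height"])
        total += direction_diff(ds["direction"], ins["direction"])
        dv, iv = ds.get("virtual"), ins.get("virtual")
        # Skip the virtual-stroke terms when either side has no virtual stroke.
        if dv is not None and iv is not None:
            total += direction_diff(dv["direction"], iv["direction"])
            total += abs(dv["length"] - iv["length"])
    return total


# Toy values, not the contents of FIG. 2 or FIG. 10.
toy_dict = [{"width": 10, "height": 2, "direction": 5,
             "virtual": {"direction": 11, "length": 6}}]
toy_in_joined = [{"width": 8, "height": 3, "direction": 5, "virtual": None}]
toy_in_lifted = [{"width": 8, "height": 3, "direction": 5,
                  "virtual": {"direction": 12, "length": 4}}]
```

With these toy values the joined input scores 3 and the input with a pen lift scores 6, which shows how the virtual-stroke terms drop out when the writer runs two strokes together.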

Next, the feature combination unit 5 determines whether any dictionary entry remains to be collated (step 107). If another character exists in the dictionary, processing returns to step 102 and the next character is collated. In this example another character exists, so the input is collated with the dictionary entry 「琢」 of FIG. 3. The feature point correspondence unit 3 performs steps 102 and 103 in the same way as above and obtains the DP matching cost dist_dp = 674 for 「琢」. Likewise, the feature point correspondence unit 3 executes step 104 and finds the coordinate points of the input pattern that correspond to the start and end points of the strokes of the character 「琢」. The result is shown in FIG. 11.

Next, the designated section feature extraction unit 4 executes step 105 and extracts the corresponding stroke features in the same way as for the dictionary entry 「家」. The result is shown in FIG. 12. The feature combination unit 5 then performs the calculation with reference to the stroke features shown in FIG. 12 and the dictionary of FIG. 3 and obtains the stroke feature distance dist_st = 223.

The above flow is repeated until no characters remain to be referenced in the dictionary. When none remain, the output unit 7 sorts the recognition results (step 108). For the sorting,

[Equation 4]

dist_all = α × dist_dp + β × dist_st (α and β are weighting constants)

is calculated for each dictionary entry. This expression is hereinafter referred to as Equation 4. With α = 1 and β = 1,

dist_all(「家」) = 682 + 93 = 775

dist_all(「琢」) = 674 + 223 = 897

are obtained. Sorting by dist_all in ascending order makes 「家」 the first candidate character and 「琢」 the second candidate character. Finally, the output unit 7 outputs the candidate characters 「家」 and 「琢」, and the processing ends (step 109).
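The combination and sorting of step 108 can be reproduced with the numbers given above. The function name `rank_candidates` and the dictionary layout are hypothetical; only Equation 4 and the four distances come from the text.

```python
def rank_candidates(scores, alpha=1.0, beta=1.0):
    """Equation 4: dist_all = alpha * dist_dp + beta * dist_st, sorted ascending."""
    totals = {ch: alpha * dp + beta * st for ch, (dp, st) in scores.items()}
    return sorted(totals.items(), key=lambda item: item[1])


# (dist_dp, dist_st) for each dictionary entry, as given in the text.
scores = {"家": (682, 93), "琢": (674, 223)}

combined = rank_candidates(scores)                 # alpha = beta = 1
dp_only = rank_candidates(scores, 1.0, 0.0)        # DP matching alone
```

With both weights set to 1 the ranking is 「家」 (775) then 「琢」 (897). Setting β to 0 puts 「琢」 (674) ahead of 「家」 (682), which is the DP-only outcome that the stroke features correct.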

As a result of the above processing, 「家」 becomes the final recognition result. With DP matching alone, 「琢」 would be the top candidate, but the correct answer is obtained by also using the corresponding stroke features when calculating the candidate characters.

As described above, according to the present embodiment, recognition is performed using both DP matching and the corresponding stroke features, so that a connected character pattern with a distorted shape can be recognized without holding dictionary data specific to connected characters.

Although Embodiment 1 matches the input character pattern against every character in the dictionary, a coarse classification using a small number of features may be performed first, with the DP matching and the corresponding stroke features calculated only for the coarse classification results. In the above example α = 1 and β = 1, so DP matching and the corresponding stroke features are weighted equally, but the values are not limited to these. Furthermore, the final distance (Equation 4) is the weighted sum of the DP matching result and the corresponding stroke feature result, but the weighting may instead be realized by other calculation methods or added conditions rather than by simple coefficients. For example, the candidates may be sorted by the corresponding stroke features and, if the distance of the first candidate exceeds a certain value, re-sorted using only the DP matching result, with the top-ranked character taken as the candidate.

Moreover, although DP matching is used for the segment matching, the method is not limited to DP matching; a relaxation method or any other method may be used. The corresponding stroke features have been described as the width and height of the corresponding portion, the direction from the start point to the end point, and the width and direction of the virtual stroke, but other features, for example basic strokes, may be extracted instead.

[Embodiment 2]

Next, a method of suppressing the rise in cost caused by variation of the character pattern in the DP matching of Embodiment 1 is described with reference to FIGS. 6(a) and 6(b), which were used in Embodiment 1, and FIGS. 13(a) to 13(d), 14, 15 and 16. FIGS. 13(a) to 13(d) show examples of characters containing portions with large variation, FIG. 14 shows an example of the contents of the segment dictionary for the character 「木」, FIG. 15 shows the segment features of FIG. 13(d), and FIG. 16 shows a segment dictionary for the character 「木」 that uses a direction-independent code.

When DP matching is performed using the direction codes and segment lengths of the segments as in Embodiment 1, the direction code of a segment or virtual stroke can vary considerably. This happens, for example, when a particular writer adds a "hook" to a certain stroke of the input character, or when the end point of one stroke of the character pattern lies close to the start point of the next stroke. For characters such as those shown in FIGS. 13(a) to 13(c), the direction difference of the virtual stroke in the circled portion reaches 8 between patterns when the 16-direction code is used, which increases the DP matching cost. Also, as shown in FIG. 13(d), when the pattern has a "hook" that the dictionary lacks, the DP matching cost becomes large, and if many such strokes exist in the same character, the input may as a result be misread as a different character.

To prevent this, the present embodiment is characterized in that, for portions whose direction differs greatly between writers or character patterns, segments are prepared in advance for which the direction difference is not calculated and only the stroke length information is used, thereby avoiding the problem.

For example, suppose that the direction code sequence and segment lengths of the pattern of FIG. 13(d) are extracted as shown in FIG. 15, and that the direction code sequence and segment lengths in the dictionary for 「木」 are as shown in FIG. 14. If the DP matching of segment features forbids a virtual stroke of the input pattern from being matched with a real stroke of the dictionary, then, because FIG. 14 and FIG. 15 both have four strokes, the strokes are matched one to one in stroke order. That is, the direction codes {9, 13} of the second stroke of the input in FIG. 15 are matched with the direction code {9} of the second stroke of the dictionary. Calculated with Equation 2 and FIG. 6(b), the cost of matching the "hook" segment of the second input stroke with the dictionary is: cost = (value for direction difference 4) × (sum of segment lengths) = 20 × (7 + 1) = 160.

In contrast, FIG. 16 shows an example of the dictionary that characterizes the present embodiment and copes with variation of the direction code. In the present embodiment, direction-independent information is added to the feature information of certain predetermined segments, namely segments such as a "hook" whose direction is thought to scatter because of individual differences in the input character pattern, so that the direction code varies considerably. In the present embodiment, a direction-independent code number is used as the direction-independent information. In addition to the 16-direction code of FIG. 6(a), the number 17 is notionally reserved as the direction-independent code and is held in the second stroke of FIG. 16. A segment whose direction code is 17 is defined to be DP-matched with its direction difference taken as 0. The cost between the second stroke of FIG. 16 and the second stroke of FIG. 15 is obtained by DP matching as in the case of FIG. 14 and is calculated using Equation 5. As in the case of FIG. 14, the matching of the second stroke pairs the dictionary {9, 17} with the input pattern {9, 13}, so that {9} corresponds to {9} and {17} corresponds to {13}. The cost of {17} and {13} is 0 × (1 + 1) = 0, and adding the cost of {9} and {9}, 0 × (7 + 7) = 0, the cost for the stroke always takes the constant value 0. By using the direction-independent code in this way, the DP matching cost becomes smaller than the cost of 160 against the dictionary of FIG. 14. As a result, the distance to the dictionary becomes smaller and misrecognition can be prevented.
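The effect of the direction-independent code on the Equation 2 cost can be checked with the numbers in this passage. This is a sketch: the function names and the (code, length) tuple layout are hypothetical, and `A_TABLE` contains only the two FIG. 6(b) values that can be inferred from the text (difference 0 gives 0, difference 4 gives 20).

```python
DIRECTION_INDEPENDENT = 17  # code reserved in the text beyond the 16 directions
A_TABLE = {0: 0, 4: 20}     # only the entries inferable from the text


def direction_diff(a, b, n=16):
    """Circular difference between two direction codes of an n-direction code."""
    d = abs(a - b) % n
    return min(d, n - d)


def local_cost(si, sd, a_table=A_TABLE):
    """Equation 2 cost; a dictionary segment coded 17 has direction difference 0."""
    if sd[0] == DIRECTION_INDEPENDENT:
        diff = 0
    else:
        diff = direction_diff(si[0], sd[0])
    return a_table[diff] * (si[1] + sd[1])


hook = (13, 1)                                # input "hook": code 13, length 1
cost_fig14 = local_cost(hook, (9, 7))         # dictionary without a hook segment
cost_fig16 = local_cost(hook, (17, 1))        # dictionary with code 17
```

Here `cost_fig14` evaluates to 160 and `cost_fig16` to 0, matching the two figures quoted in the description.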

However, when DP matching is performed against a direction-independent code, the matching does not always turn out as expected. This is explained using FIGS. 17(a) and 17(b). Suppose that the polyline of FIG. 17(a) is DP-matched with the dictionary of FIG. 17(b). FIG. 17(a) has five segments (11 to 15) and FIG. 17(b) has two segments (16, 17). Using FIG. 6(a), the direction codes are 9 for segment 11, 5 for segment 12, 9 for segment 13, 5 for segment 14 and 9 for segment 15; segment 16 has direction code 9 and segment 17 has the direction-independent code 17. All segment lengths are 1. FIGS. 17(a) and 17(b) are now matched using Equations 1 and 2. First, segment 11 is matched with segment 16 at a cost of 0 × (1 + 1) = 0. Next, in Equation 1, segment 12 and segment 16 have a direction difference of 4, so from FIG. 6(b) and Equation 2 their cost is 20 × (1 + 1) = 40. Segment 11 and segment 17 have cost 0 because the match is with the direction-independent code, and segment 12 and segment 17 likewise have cost 0. The lowest-cost pairing, segment 11 with segment 17 or segment 12 with segment 17, is therefore chosen. Calculating in the same way, the remaining segments 13, 14 and 15 are all matched with segment 17, each at cost 0. As a result, by Equation 3 the segment correspondence cost between FIGS. 17(a) and 17(b) is 0 / (5 + 2) = 0.

This result would mean that FIGS. 17(a) and 17(b) are identical, which is a wrong correspondence. If such correspondences are allowed, the input may be recognized as a character that contains a direction-independent code but whose shape is not similar at all. To prevent this, the present embodiment sets an upper limit on the number of segments obtained from the input character pattern that may be matched with a segment to which the direction-independent code has been added.

For example, if the upper limit on matches with the direction-independent code is 1, then in matching FIGS. 17(a) and 17(b) segments 11 to 15 are matched with segment 16 and segment 15 is matched with segment 17. The cost is 0 (segment 11 and segment 16) + 40 (segment 12 and segment 16) + 0 (segment 13 and segment 16) + 40 (segment 14 and segment 16) + 0 (segment 15 and segment 16) + 0 (segment 15 and segment 17) = 80, which, unlike the earlier cost of 0, is the expected cost.
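The upper limit can be sketched as an extra DP state that counts how many input segments have been matched to the current direction-independent dictionary segment. As before, this is an illustration under assumptions rather than the patent's method: the recurrence (three predecessors) is assumed because Equation 1 is not reproduced in this text, `A_TABLE` holds only the two inferable FIG. 6(b) values, and all names are hypothetical.

```python
WILDCARD = 17            # direction-independent code from the text
A_TABLE = {0: 0, 4: 20}  # only the entries inferable from the text
INF = float("inf")


def direction_diff(a, b, n=16):
    d = abs(a - b) % n
    return min(d, n - d)


def local_cost(si, sd, a_table):
    if sd[0] == WILDCARD:
        return 0
    return a_table[direction_diff(si[0], sd[0])] * (si[1] + sd[1])


def accumulated_cost(inp, dic, a_table=A_TABLE, wildcard_limit=None):
    """Accumulated DP cost d[I][J]; wildcard_limit (>= 1) caps matches to code 17."""
    I, J = len(inp), len(dic)

    def limited(j):
        return wildcard_limit is not None and dic[j][0] == WILDCARD

    # d[i][j][k]: cheapest path ending at pair (i, j). For a limited dictionary
    # segment, k + 1 input segments have been matched to it so far.
    d = [[[INF] * (wildcard_limit if limited(j) else 1) for j in range(J)]
         for i in range(I)]
    for i in range(I):
        for j in range(J):
            c = local_cost(inp[i], dic[j], a_table)
            if i == 0 and j == 0:
                d[0][0][0] = c
                continue
            # Moves that arrive from the previous dictionary segment.
            enter = INF
            if j > 0:
                enter = min(d[i][j - 1])
                if i > 0:
                    enter = min(enter, min(d[i - 1][j - 1]))
            if limited(j):
                d[i][j][0] = enter + c  # first input segment matched to j
                if i > 0:
                    for k in range(1, wildcard_limit):
                        d[i][j][k] = d[i - 1][j][k - 1] + c
            else:
                stay = d[i - 1][j][0] if i > 0 else INF
                d[i][j][0] = min(enter, stay) + c
    return min(d[I - 1][J - 1])


# FIG. 17 example: input codes 9, 5, 9, 5, 9 and dictionary codes 9, 17.
fig17a = [(9, 1), (5, 1), (9, 1), (5, 1), (9, 1)]
fig17b = [(9, 1), (17, 1)]
```

Calling `accumulated_cost(fig17a, fig17b)` without a limit gives 0, while `accumulated_cost(fig17a, fig17b, wildcard_limit=1)` gives 80, the two accumulated costs quoted in the text (before any division by I + J).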

In this embodiment, the direction difference of a segment whose direction code is 17 is set to 0 so that the cost calculated for it becomes 0 and segments such as a "hook" are ignored. Variations are also possible, such as setting the cost to a constant value other than 0 or setting the direction difference so that the cost does not become 0.

In the above example, Equations 1 to 3 are used to calculate the DP matching correspondence, but the calculation is not limited to these and other expressions may be used.

According to the present invention, for connected characters (characters written with strokes run together without lifting the pen), which are easily misrecognized, and for characters whose writing directions resemble one another, character recognition is performed using both the segment matching and the corresponding stroke features. This allows a more detailed examination of the characters and, as a result, character recognition with good accuracy. In addition, as long as the stroke order is the same, recognition is possible with a single kind of dictionary no matter which parts of the character are written connectedly, so there is no need to prepare a separate dictionary for each different connected form. Accordingly, a connected character pattern with a distorted shape, for example, can be recognized without holding dictionary data for connected characters, which also brings various attendant effects such as less effort in creating the dictionary and a smaller dictionary size.

Furthermore, because the weighting between the segment matching and the corresponding stroke features can be set when the final character recognition is calculated, more accurate character recognition processing can be provided.

Furthermore, by setting an upper limit on the number of segments obtained from the input character pattern that may be matched with a segment to which direction-independent information has been added, character recognition of even higher accuracy can be achieved.

Claims

An online character recognition apparatus that receives coordinate sequence data of a character pattern and outputs a character code corresponding to the input character pattern, comprising: input means for inputting coordinate point sequence data on the strokes produced when the input character pattern is written; feature extraction means for treating as a segment each straight-line portion obtained by polyline approximation of the time-ordered coordinate points contained in the coordinate point sequence data input by the input means, and for extracting feature information about each segment and feature points, which are the end points of each segment; dictionary storage means for storing in advance a dictionary containing, for each character, feature information about the segments constituting the character and feature points; feature point correspondence means for matching, on the basis of the feature information of each character described in the dictionary and the feature information extracted by the feature extraction means, the segments constituting each character in the dictionary with the segments obtained from the input character pattern, and for calculating a segment correspondence distance; designated section feature extraction means for extracting, as a corresponding stroke feature, the feature information of the section determined by the feature point pair on the strokes of the input character pattern that corresponds to a feature point pair on a stroke designated by the dictionary; feature combining means for collating the corresponding stroke feature extracted by the designated section feature extraction means with the feature information in the dictionary and thereby calculating a distance of the corresponding stroke feature; and output means for outputting a candidate character code obtained by the feature combining means, wherein the feature combining means identifies the character in the dictionary that corresponds to the input character pattern on the basis of the calculated distance of the corresponding stroke feature and the segment correspondence distance calculated by the feature point correspondence means.

The online character recognition apparatus according to claim 1, wherein the feature point correspondence means associates one coordinate point of the input character pattern with each of the start point and the end point of each stroke of a character in the dictionary, and the designated section feature extraction means takes, as a feature point pair, the feature point of the input character pattern corresponding to the start point and the feature point of the input character pattern corresponding to the end point.

The information about each segment includes the direction and length of each segment, and the feature point correspondence means matches the segments on the basis of the direction and length of each segment calculated by the feature extraction means, calculates the cost of each matched segment, and calculates the segment correspondence distance on the basis of that cost.