KR100356503B1

KR100356503B1 - Device for recognizing learning character

Info

Publication number: KR100356503B1
Application number: KR1019940033153A
Authority: KR
Inventors: 강민석
Original assignee: 엘지전자 주식회사
Priority date: 1994-12-07
Filing date: 1994-12-07
Publication date: 2002-12-11
Also published as: KR960025222A

Abstract

PURPOSE: A learning-type character recognition device is provided that recognizes online handwritten English and promptly learns unrecognized English handwriting by representing learned strokes and their positional relationships. CONSTITUTION: A stroke numbering unit (20) matches each stroke of an input character against the learned strokes and registers the matching stroke numbers. A stroke database (40) stores the learned strokes. A character database (50) holds the learned characters. A candidate character matching unit (30) matches the stroke numbers passed from the stroke numbering unit (20) against the learned characters and registers a plurality of candidate characters. A character sorting unit (60) ranks the registered candidate characters by proximity. An output unit (70) outputs the character recognized via the character sorting unit (60).

Description

Learning type character recognition device

The present invention relates to a device for recognizing English characters, and in particular to a learning-type character recognition device that quantitatively represents learned strokes and their positional relationships, so that it can not only recognize online handwritten English but also learn, on the spot, English characters in any handwriting that it fails to recognize.

As shown in FIG. 1, the conventional English character recognition process consists of three steps: a preprocessing step that removes unnecessary points from a character input through a tablet and keeps only the necessary information; a feature extraction step that extracts features from the preprocessed points using a Fourier transform; and a recognition step that recognizes the character by comparing the Fourier coefficients obtained in the feature extraction step with the Fourier coefficients of reference characters.

The prior art consisting of these steps operates as follows.

When the user writes a character on the tablet as if writing on paper, the preprocessing step removes unnecessary points from the points input through the tablet, keeps the necessary information, and normalizes the size. The feature extraction step then receives as input the sequence of coordinates generated as the strokes are drawn and Fourier-transforms the sequence of X coordinates and the sequence of Y coordinates. That is, all written strokes are assumed to be connected so that they form a single stroke, and the X coordinates and Y coordinates of the points making up that stroke, sampled at a constant time interval, are each Fourier-transformed to obtain the Fourier coefficients.

Once the Fourier coefficients have been obtained in the feature extraction step, the recognition step recognizes the character by comparing them with the Fourier coefficients of the reference characters that have already been entered.

That is, the Fourier coefficients of each character are stored in a database, the Fourier coefficients of the input are compared with those of the reference characters, and the most similar reference character is output as the recognized character.
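The prior-art pipeline described above can be illustrated with a minimal Python sketch. It is not taken from the patent: the function names, the number of retained coefficients (`n_coeffs`), and the sum-of-absolute-differences comparison are all assumptions made for illustration.

```python
import cmath
import math

def dft(values, n_coeffs):
    """Return the first n_coeffs discrete Fourier coefficients of a real sequence."""
    n = len(values)
    return [
        sum(v * cmath.exp(-2j * math.pi * k * t / n) for t, v in enumerate(values)) / n
        for k in range(n_coeffs)
    ]

def fourier_features(strokes, n_coeffs=4):
    # Treat all strokes as one connected stroke, as the prior art assumes.
    points = [p for stroke in strokes for p in stroke]
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    # The X and Y sequences are transformed separately, which is the
    # cost the description criticizes.
    return dft(xs, n_coeffs) + dft(ys, n_coeffs)

def recognize_by_fourier(strokes, references, n_coeffs=4):
    """references: {character: feature list}. Returns the closest character.
    The distance measure is an assumption; the source only says 'most similar'."""
    feats = fourier_features(strokes, n_coeffs)

    def distance(ref):
        return sum(abs(a - b) for a, b in zip(feats, ref))

    return min(references, key=lambda ch: distance(references[ch]))
```

Because only one fixed set of reference coefficients exists per character, a writer whose habits differ from the reference has no way to register a personal variant, which is the second drawback the description raises.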

However, in the prior art described above, the Fourier coefficients of a stroke are computed separately for the X axis and the Y axis, so the computation time becomes long. In addition, distinctive handwriting that reflects personal habits cannot be recognized.

Accordingly, an object of the present invention is to provide a learning-type character recognition device that extracts the strokes of an input character, compares them with the strokes of learned reference characters to obtain a plurality of candidates, matches the start point, end point, and center point values of those candidates against those of the reference characters, ranks the resulting candidate characters by proximity, recognizes the final candidate character, and outputs the recognized character.

Another object of the present invention is to provide a learning-type character recognition device that, for a character that is not recognized, adds the new strokes and stroke numbers of that character to the database so that the character can be learned.

To achieve the above objects, the present invention comprises: a recognition stroke numbering unit that matches each stroke of the input character against already-learned strokes and registers a plurality of stroke numbers; a stroke database unit that stores the database of learned strokes; a character database unit that holds the database of learned characters; a candidate character matching unit that matches the plurality of stroke numbers passed from the recognition stroke numbering unit against the characters already learned in the character database unit and registers a plurality of candidate characters; a character sorting unit that ranks the candidate characters registered in the candidate character matching unit by proximity; and an output unit that outputs the character recognized via the character sorting unit.

The operation and effects of the present invention configured in this way are described in detail below.

When a character is input through the input unit (10), each stroke of the character is preprocessed by distance filtering and then matched against the strokes learned in the stroke database unit (40), and a list of the numbers of the similar learned strokes is extracted. The stroke database unit (40) converts the information on each stroke of the reference characters into distance-filtered data and stores it with a unique number: the first learned stroke is numbered 1, the second 2, and so on.
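The preprocessing and numbering scheme can be sketched as follows. The patent does not define distance filtering precisely; this sketch assumes the common interpretation of keeping a point only when it lies at least `min_dist` from the previously kept point. The names `distance_filter`, `StrokeDatabase`, and `min_dist` are hypothetical.

```python
import math

def distance_filter(points, min_dist=1.0):
    """Keep a point only if it lies at least min_dist from the last kept point.
    (Assumed interpretation of 'distance filtering'.)"""
    if not points:
        return []
    kept = [points[0]]
    for p in points[1:]:
        if math.dist(p, kept[-1]) >= min_dist:
            kept.append(p)
    return kept

class StrokeDatabase:
    """Stores distance-filtered strokes under sequential unique numbers."""

    def __init__(self, min_dist=1.0):
        self.min_dist = min_dist
        self.strokes = {}  # stroke number -> filtered point list

    def learn(self, points):
        # The first learned stroke gets number 1, the second 2, and so on.
        number = len(self.strokes) + 1
        self.strokes[number] = distance_filter(points, self.min_dist)
        return number
```

Storing the reference strokes in the same filtered form as the input means both sides of the later match are sampled at comparable spatial intervals.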

Here, the developer trains the device on the strokes of the reference characters, and the user can additionally train it on his or her own handwriting.

When the recognition stroke numbering unit (20) matches a stroke against the stroke database, any method capable of measuring the similarity between strokes, such as DP matching or a neural network, may be used, provided that the matching can yield a plurality of candidates rather than only a single stroke.
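As one possible instance of DP matching, the sketch below uses a dynamic-time-warping style alignment cost and returns the several closest stroke numbers instead of a single best match. The specific cost function, the `max_candidates` cutoff, and the function names are assumptions; the patent leaves the similarity measure open.

```python
import math

def dp_distance(a, b):
    """Dynamic-programming alignment cost between two point sequences."""
    inf = float("inf")
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            step = math.dist(a[i - 1], b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[len(a)][len(b)]

def candidate_stroke_numbers(stroke, learned, max_candidates=3):
    """learned: {stroke number: point list}. Returns the numbers of the
    max_candidates most similar learned strokes, best first."""
    scored = sorted((dp_distance(stroke, pts), num) for num, pts in learned.items())
    return [num for _, num in scored[:max_candidates]]
```

Returning a short list rather than one number leaves the final decision to the later character-level match, where positional information can resolve strokes that look alike in isolation.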

When a plurality of stroke candidates are obtained in this way, the candidate character matching unit (30) receives them and compares them one by one with each character in the character database unit (50), which stores, for every English character, its stroke number list and the normalized start point, end point, and center point values of each stroke. The unit computes the differences between these values and sums them, and if the sum falls within a certain range, the character is registered as a recognized candidate character.
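A minimal sketch of this character-level match is given below. Several details are assumptions not stated in the source: that a stored character is considered only if it has the same number of strokes as the input and each stored stroke number appears among the candidates for the corresponding input stroke, that differences are measured as Euclidean distances, and the value of `threshold`. All names are hypothetical.

```python
import math
from dataclasses import dataclass

@dataclass
class StrokeInfo:
    start: tuple   # normalized start point
    end: tuple     # normalized end point
    center: tuple  # normalized center point

def position_difference(a, b):
    """Sum of the start, end, and center point differences between two strokes."""
    return (math.dist(a.start, b.start)
            + math.dist(a.end, b.end)
            + math.dist(a.center, b.center))

def match_characters(input_strokes, char_db, threshold=0.5):
    """input_strokes: list of (candidate_numbers, StrokeInfo), one per stroke.
    char_db: {character: list of (stroke_number, StrokeInfo)}.
    Returns a list of (character, summed_difference) candidates."""
    candidates = []
    for char, ref_strokes in char_db.items():
        if len(ref_strokes) != len(input_strokes):
            continue
        total = 0.0
        compatible = True
        for (numbers, info), (ref_number, ref_info) in zip(input_strokes, ref_strokes):
            if ref_number not in numbers:
                compatible = False
                break
            total += position_difference(info, ref_info)
        # Register the character only if the summed difference is in range.
        if compatible and total <= threshold:
            candidates.append((char, total))
    return candidates
```

The summed difference is kept alongside each candidate so that it can serve as the matching value when the candidates are ranked.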

When a plurality of candidate characters are registered in this way, the character sorting unit (60) ranks the candidates by their matching values, takes the first character in the ranking as the recognized character, and outputs it through the output unit (70).
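The ranking step amounts to a sort on the matching value. The sketch below assumes that a smaller matching value means a closer match and that candidates arrive as `(character, matching_value)` pairs; both the pair layout and the function names are illustrative.

```python
def rank_candidates(candidates):
    """Sort (character, matching_value) pairs so the closest match comes first."""
    return sorted(candidates, key=lambda item: item[1])

def recognize(candidates):
    """Return the top-ranked character, or None when no candidate was registered."""
    ranked = rank_candidates(candidates)
    return ranked[0][0] if ranked else None
```

A `None` result corresponds to the case in which no candidate is recognized, which the device handles by entering its learning state.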

If the candidate character matching unit (30) finds no candidate, the device switches to a learning state. In the learning state, the user indicates which character was written, each stroke of the input character is assigned a stroke number through the recognition stroke numbering unit (20) and stored in the stroke database unit (40), and the start point, end point, and center-of-gravity values of each stroke are stored in the character database unit (50).

When stroke numbers are assigned, each stroke is first input to the recognition stroke numbering unit (20) for recognition. If the stroke is recognized, the corresponding stroke number is assigned; if it is not, the stroke and a new stroke number are added to the stroke database unit (40).
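The learning state can be sketched as below. The class `LearningRecognizer`, the pluggable `stroke_matcher` callable, and the exact-equality matcher used as a stand-in are all hypothetical; a real device would use a similarity-based matcher such as DP matching, and would normalize the coordinates before storing them.

```python
def exact_matcher(points, stroke_db):
    """Stand-in matcher (assumption): a stroke is 'recognized' only if an
    identical point list is already stored. Returns its number or None."""
    for number, stored in stroke_db.items():
        if stored == list(points):
            return number
    return None

class LearningRecognizer:
    def __init__(self, stroke_matcher=exact_matcher):
        self.stroke_matcher = stroke_matcher
        self.stroke_db = {}  # stroke number -> point list
        self.char_db = {}    # character -> list of learned stroke entries

    def number_stroke(self, points):
        # Try to recognize the stroke first; add it as new only on failure.
        number = self.stroke_matcher(points, self.stroke_db)
        if number is None:
            number = len(self.stroke_db) + 1
            self.stroke_db[number] = list(points)
        return number

    def learn(self, char, strokes):
        """Store the stroke numbers plus start, end, and centroid of each stroke."""
        entry = []
        for points in strokes:
            number = self.number_stroke(points)
            cx = sum(x for x, _ in points) / len(points)
            cy = sum(y for _, y in points) / len(points)
            entry.append((number, points[0], points[-1], (cx, cy)))
        self.char_db.setdefault(char, []).append(entry)
        return entry
```

Reusing an existing stroke number whenever the stroke is recognized keeps the stroke database from growing with every learned character; only genuinely new stroke shapes receive new numbers.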

As described above, the present invention can be used in portable personal terminals and other products that need text input but on which a keyboard is difficult to mount, and it can be installed on desktop computers for users who are not accustomed to using a keyboard.

FIG. 1 is a flowchart of the conventional English character recognition process.

FIG. 2 is a block diagram of the learning-type character recognition device of the present invention.

******* Explanation of reference numerals for the main parts of the drawings *******

10 : input unit
20 : recognition stroke numbering unit
30 : candidate character matching unit
40 : stroke database unit
50 : character database unit
60 : character sorting unit
70 : output unit

Claims

A learning-type character recognition device comprising: a recognition stroke numbering unit that matches each stroke of an input character against already-learned strokes and registers a plurality of stroke numbers; a stroke database unit that stores a database of the learned strokes; a character database unit that holds a database of learned characters; a candidate character matching unit that matches the plurality of stroke numbers passed from the recognition stroke numbering unit against the characters already learned in the character database unit and registers a plurality of candidate characters; a character sorting unit that ranks the candidate characters registered in the candidate character matching unit by proximity; and an output unit that outputs the character recognized via the character sorting unit.

The learning-type character recognition device of claim 1, wherein the stroke database unit converts the information on each stroke of the reference characters into distance-filtered data and stores it with a unique number.

The learning-type character recognition device of claim 1, wherein the stroke database unit stores the new strokes and stroke numbers of a character when candidate character recognition fails.

The learning-type character recognition device of claim 1, wherein the character database unit stores, for each existing English character, the list of stroke numbers constituting the character and the normalized start point, end point, and center point values of each stroke.

The learning-type character recognition device of claim 1, wherein the character database unit adds new stroke information for a character for which candidate character recognition fails.