KR930001471B1

KR930001471B1 - Method and apparatus for recognizing handwritten and printed characters

Info

Publication number: KR930001471B1
Application number: KR1019900018517A
Authority: KR
Inventors: 고인규; 박병철; 최태승; 한영덕; 김지덕; 한미라
Original assignee: 한국과학기술원; 이상수
Priority date: 1990-11-15
Filing date: 1990-11-15
Publication date: 1993-02-27
Also published as: KR920010492A

Abstract

No content.

Description

Method and apparatus for recognizing handwritten and printed characters

FIG. 1 is a block diagram showing the processing sequence of a conventional pattern or character recognition system.

FIG. 2 shows an embodiment of an image processing apparatus to which the present invention can be applied.

FIG. 3 shows the procedure for determining the outer and inner contours and the contour tracing start points according to the present invention.

FIG. 4 shows the direction code at each contour pixel as used in the present invention.

FIG. 5 shows the prior-art procedure for extracting direction codes during contour tracing.

FIG. 6 shows the procedure for extracting director codes from the direction codes, as applied in the present invention.

FIG. 7 shows the procedure for extracting the director codes of the first smoothing range from the director codes according to the present invention.

FIG. 8 shows the procedure for sequentially extracting director code strings up to the R-th smoothing range according to the present invention.

FIG. 9 shows the procedure for performing substitution on the shape code string extracted from the director code string of a predetermined smoothing range according to the present invention.

FIG. 10 shows the procedure for recognizing the pattern of a text according to the present invention.

FIG. 11 shows the procedure for counting the number of successful pattern recognitions in each smoothing range during the process of FIG. 10.

FIG. 12 shows an image pattern of the English character "E".

FIGS. 12A to 12F show the shapes that result from processing the image pattern of FIG. 12 according to the present invention.

FIG. 13 shows an image pattern of the English character "E" in a different style.

FIGS. 13A to 13C show the shapes that result from processing the image pattern of FIG. 13 according to the present invention.

FIG. 14 shows an image pattern of the Chinese character "可".

FIGS. 14A to 14J show the shapes that result from processing the image pattern of FIG. 14 according to the present invention.

FIG. 15 shows an image pattern of the Chinese character "可" in a different style.

FIGS. 15A to 15C show the shapes that result from processing the image pattern of FIG. 15 according to the present invention.

FIG. 16 shows an image pattern of the Korean character "굴", with the white pixels "0" omitted.

FIGS. 16A and 16B show the shapes that result from processing the image pattern of FIG. 16 according to the present invention.

FIG. 17 shows an image pattern of the Japanese character "ス", with the white pixels "0" omitted.

FIGS. 17A to 17D show the shapes that result from processing the image pattern of FIG. 17 according to the present invention.

FIG. 18 shows an image pattern of the Japanese character "ス" in a different style, with the white pixels "0" omitted.

FIGS. 18A to 18D show the shapes that result from processing the image pattern of FIG. 18 according to the present invention.

The present invention relates to a method and apparatus capable of recognizing handwritten characters, printed characters, and other arbitrary patterns, and in particular to a pattern recognition method and apparatus usable as a human-machine interface for electronic devices such as computers, typewriters, and speech synthesizers. Various systems have been proposed for pattern recognition. A pattern recognition system typically receives data representing the pattern to be recognized and performs predetermined operations to compare it with known patterns. FIG. 1 is a block diagram showing the processing sequence of a conventional pattern recognition system. The input pattern (10) is the pattern to be recognized. A digitizer (11), such as those used in facsimile machines, electronic copiers, and optical character recognition systems, converts the input pattern into digital image data, with binary "0" corresponding to white and binary "1" corresponding to black, so that it can be stored in system memory. A pixel whose gray level is greater than a threshold is treated as "1", and one whose gray level is smaller is treated as "0". The input image data is segmented into individual characters (step 12), and features are extracted from each segmented character (step 13). The data representing the extracted features is compared with known pattern data to find a match (step 14). The matched pattern data is converted into output data, such as ASCII codes, that is supplied to an output device such as a printer, a display, or a speech synthesizer (step 15).

The feature extraction methods proposed in the prior art are the thinning method and the contour tracing method. Feature extraction by thinning spends a great deal of time extracting the skeleton from characters written with thick strokes, and accurate skeleton extraction is difficult because the method is sensitive to stroke irregularities and noise. Feature extraction by contour tracing follows the black (binary "1") pixels on the stroke boundary using direction codes, and is therefore faster than thinning.
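The thresholding rule described for the digitizer can be sketched as follows. This is a minimal illustration, not the digitizer's actual implementation: the function name `binarize`, the list-of-lists image layout, and the handling of a gray level exactly equal to the threshold (mapped to "0" here) are assumptions, since the text only covers the greater-than and less-than cases.

```python
def binarize(gray, threshold):
    """Map a gray-level image to binary pixels: 1 (black) or 0 (white).

    gray      -- list of rows, each a list of gray levels (assumed layout)
    threshold -- gray level above which a pixel is treated as "1"
    """
    # Assumption: a value equal to the threshold is treated as "0".
    return [[1 if level > threshold else 0 for level in row] for row in gray]


if __name__ == "__main__":
    sample = [[10, 200], [128, 129]]
    print(binarize(sample, 128))  # -> [[0, 1], [0, 1]]
```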

In the contour tracing method, however, feature extraction processing that can remove stroke irregularities and noise is required. Prior art on contour tracing is disclosed in IEEE Trans. Comput., Vol. C-1, No. 11, pp. 1206-1216, November 1972, and in U.S. Patent No. 4,773,098, entitled "Method of optical character recognition", issued to W. C. Scott on September 20, 1988.

In that patent, the input pattern digitized from an image is transferred to a memory, and only the data representing the character boundary is read from the stored data. During contour tracing, character parameters such as height, width, perimeter, and area are determined and compared with reference patterns to find the one that matches the character. When a matching reference pattern exists, pattern recognition is accomplished by outputting that reference pattern. Although this feature extraction method is independent of character size, it performs no further processing to remove noise caused by scanner read errors, paper quality, ink bleeding, and the like. It also has difficulty recognizing characters whose form varies with personal handwriting style, with the stroke characteristics of different typefaces, and with the quality of paper and ink.

In general, a character can take various forms depending on handwriting style, printing style, and so on. Examples are shapes that develop in a complicated way at stroke corners, partially smeared contours, strokes that change from a straight line into another form (for example, the Chinese character "木"), and characters composed of more than one grapheme (for example, the Hangul characters "서", "소", and "곳"). One character can therefore be written or printed in many modified forms. Consequently, if feature extraction based on simple contour tracing is used, many reference patterns for one and the same character must be stored in a read-only memory (ROM) or other storage means. Because the amount that must be stored is then unbounded, the total ROM capacity increases greatly. The problem is especially serious for Chinese characters, Hangul, and Japanese, which have diverse strokes.

Some characters, Chinese characters in particular, have both outer and inner contours, so feature extraction for such characters requires extracting both the inner and the outer contours. Accordingly, an object of the present invention is to provide a character recognition method and apparatus that removes noise or irregularities on lines in character feature extraction using the contour tracing method.

Another object of the present invention is to provide a method and apparatus that, in character feature extraction using the contour tracing method, can recognize a pattern regardless of the style or size of the written or printed character.

Another object of the present invention is to provide a pattern recognition method and apparatus that minimizes the storage capacity required for reference patterns regardless of the style or size of handwritten or printed characters.

Another object of the present invention is to provide a method and apparatus for tracing the outer and inner contours of an image pattern.

Another object of the present invention is to provide a pattern recognition method and apparatus that can accurately extract the features of an image pattern by contour tracing regardless of whether the lines are thin or thick.

To achieve the above objects, the present invention is characterized by comprising the steps of: extracting, from the stored image data, a direction code string that traces the contour of the pattern; extracting, from the direction code string, a director code string that gives the change of direction at each contour pixel; and extracting, from the director code string, a director code string of a predetermined smoothing range in order to remove noise from the pattern. The invention is further characterized by comprising the steps of: scanning a plurality of lines across the image pattern in one direction; detecting, during the scanning, a first contour point that meets the contour of the image pattern; tracing the contour from the first contour point when it is detected; detecting, while scanning the same line, a second contour point on the image pattern that has not yet been traced; and determining whether the second contour point belongs to an inner or an outer contour according to the number of contour points detected while scanning that line. The invention is further characterized by storing substitution codes in a memory device, extracting a director code string of a predetermined smoothing range from the director code string, and comparing a first code string selected from the extracted director code string with the substitution codes and substituting it.

A preferred embodiment of the present invention is described in detail below with reference to the accompanying drawings. The term "image" as used herein should be understood to denote characters, symbols, figures, and the like. FIG. 2 shows an embodiment of an apparatus that implements the image processing method according to the present invention. Reference numeral 20 denotes an image scanner, which converts an image into digital data. The scanner (20) used in the present invention is an MS-2, 300A manufactured by MICROTEE Co., which can handle black-and-white pixels at 300 DPI (dots per inch). It should be understood, however, that the invention is not limited to this scanner (20). Reference numeral 22 denotes a microprocessor, which controls image processing according to the present invention. The memory (24) consists of a read-only memory (hereinafter "ROM") and a random access memory (RAM). The ROM stores the image processing program according to the present invention, a dictionary (that is, the reference patterns) against which the processed character data is compared, and the data for the substitution described later.

The image data digitized by the scanner (20) is processed into a matrix of binary digits, either by a known preprocessor (not shown) or by the microprocessor (22), and then stored in the RAM. If a page of text has been scanned by the scanner (20), the data stored in the RAM is separated into individual characters by a known image segmentation process. Under the control of the microprocessor (22), the image data of each segmented character is then processed according to the features of the invention described below: inner and outer contour extraction, feature extraction, and pattern recognition of the text. After this processing, the recognized data is converted into ASCII codes and output to an output device (26) such as a printer, a display, or a speech synthesizer.

In another preferred embodiment of the present invention, a pattern recognition system using the scanner (20) and a general-purpose computer may be used. In that case, the image data digitized by the scanner (20) is processed into a matrix of binary digits by a preprocessor, such as a digital signal processing board installed in the general-purpose computer, and stored in a buffer memory. The pattern recognition operations according to the present invention are carried out according to a program that is entered from the keyboard and stored in a storage device such as a hard disk or floppy disk. The general-purpose computer used in this case is a "SPARC station 1" manufactured by "SUN Microsystem".

[Internal and External Contour Extraction]

Feature extraction for pattern recognition in the present invention is based on the contour tracing method. Before contour tracing can be performed, the contour tracing start point must be found, and some kinds of patterns have inner contours. For patterns in which an inner contour plays an important role in determining the features of the pattern, it is necessary not only to find the start points of the inner and outer contours but also to distinguish the inner contours from the outer ones. The same applies to a pattern that has a separate inner stroke or shape enclosed by an outer stroke or shape. When a single character is a combination of unconnected consonants and vowels, as in Hangul, the contour tracing start point of each grapheme must be found in order to extract the features of each grapheme.

FIG. 3 is a flowchart of the method for finding the inner and outer contours and their contour tracing start points according to the present invention. For convenience, the flowchart of FIG. 3 is explained using the Chinese character shown in FIG. 14 as an example. The character of FIG. 14 is an image pattern stored in the RAM or buffer memory by the segmentation described above, where binary "0" is the image value of a white pixel of the input pattern, binary "1" is the image value of a black pixel, R denotes the row, and C denotes the column. The illustrated character is thus an image pattern stored as an N×M matrix of 38×38 rows and columns. To find the outer and inner contours, a column (vertical) scan is started from the first column (C=1) under the control of the microprocessor (22).

During the scan, a contour can be found by reading and comparing the image values IPD(R, C) and IPD(R+1, C) of two adjacent pixels (steps 302-304). For example, while the first column is scanned (steps 302-306 executed repeatedly), no binary change occurs, so when R=38 (=N) the procedure advances from step 305 through steps 307-309. The scan of the first column thus ends and the scan of the next column (C=2) begins. While the second column is scanned, when the fifth row is selected (R=5), IPD(5, 2)≠IPD(6, 2), so the procedure advances from step 304 to step 310. On detecting this binary change, the count CNT becomes 2 (step 310). Because the pixel at position SP1(6, 2) has not been traced before, the procedure advances through step 311 to step 312. Since CNT (=2) is even in step 312, position SP1(6, 2) is stored in step 313 as an outer-contour tracing start point, and contour tracing from this start point proceeds in step 314. As the pixels on the contour are traced, the direction codes described below are determined, and the traced pixels are marked by tagging them. The row is then incremented by one in step 315 and the procedure returns to step 302. When the row reaches R=8, the next binary change occurs, so the count CNT becomes 3 in step 310.

However, because the pixel at position (8, 2) has already been traced, step 315 follows step 311. From R=9 to R=37 there is no binary change, so steps 302-306 are executed repeatedly. When R=38, steps 305 and 307-309 are executed, the count CNT is reset to 1, and the next column (C=3) is scanned. Whenever a binary change is detected while the third through seventh columns are scanned, the corresponding pixel has already been traced, so step 315 follows step 311 and no new contour tracing start point is found. While the eighth column (C=8) is scanned, when the tenth row is selected, IPD(10, 8)≠IPD(11, 8), so the count CNT becomes 4 in step 310. Because the pixel at position SP2(11, 8) has never been traced, position SP2(11, 8) is determined and stored in step 313 as another outer-contour tracing start point, and contour tracing is performed in step 314 to extract the direction codes.

Continuing the scan in this way, when the fifteenth row of the twelfth column (C=12, R=15) is selected, the image value IPD(15, 12) differs from IPD(16, 12), so the count CNT becomes 5 in step 310. Because the pixel at position SP3(15, 12) has never been traced and CNT is odd in step 312, position SP3(15, 12) is determined and stored as an inner-contour start point, and in step 314 the direction codes of the inner contour are extracted by contour tracing. When the scan reaches the pixel at position (37, 38), the extraction of the outer and inner contours and their tracing start points ends via steps 305 and 307, and processing moves on to the next operation.
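The column scan and the even/odd counting rule walked through above can be sketched in Python as follows. This is a hedged illustration, not the flowchart of FIG. 3 itself: the function name `find_contour_starts`, the 0-based (row, column) indexing, and the `trace` callback (standing in for the contour tracing of step 314, assumed to return the pixels it visited) are all introduced here for illustration. The sketch also assumes the pattern does not touch the top row of the image, so that the first binary change in every column is white to black.

```python
from typing import Callable, Iterable, List, Set, Tuple

Pixel = Tuple[int, int]  # (row, column), 0-based in this sketch


def find_contour_starts(
    img: List[List[int]],
    trace: Callable[[List[List[int]], Pixel], Iterable[Pixel]],
) -> List[Tuple[Pixel, str]]:
    """Scan column by column and classify untraced contour points.

    img   -- binary image, 1 = black, 0 = white
    trace -- hypothetical contour tracer; returns every pixel on the
             contour that passes through the given start pixel
    """
    n_rows, n_cols = len(img), len(img[0])
    traced: Set[Pixel] = set()          # pixels tagged by earlier traces
    starts: List[Tuple[Pixel, str]] = []
    for c in range(n_cols):
        cnt = 1                         # the count is reset to 1 per column
        for r in range(n_rows - 1):
            if img[r][c] == img[r + 1][c]:
                continue                # no binary change between neighbours
            cnt += 1
            # The black pixel of the differing pair is the contour pixel.
            p = (r + 1, c) if img[r + 1][c] == 1 else (r, c)
            if p in traced:
                continue                # already traced: move to the next row
            kind = "outer" if cnt % 2 == 0 else "inner"
            starts.append((p, kind))
            traced.update(trace(img, p))
    return starts
```

With a tracer that tags every pixel of the contour it follows, each contour yields exactly one start point, and the parity of the count at that point tells whether it is an outer or an inner contour. The sketch does not handle the thin-stroke case, in which one pixel lies on two contours.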

The preferred embodiment has been described with column scanning used to find the inner and outer contour tracing start points, but row scanning may be used instead. In addition, when a character has a thin part, contour tracing may meet an already-traced point and be unable to return to the start point. In that case, the trace passes over the traced point and returns to the start point, and at the same time a contour point is set outside the already-traced contour point, so that the inner and outer contour extraction can proceed.

[Feature Extraction]

According to the present invention, the contour tracing method is the basis for extracting the features of the input image pattern. As described above, the contour tracing start point is determined by the method of FIG. 3. Then, proceeding clockwise from the start point, step 314 of FIG. 3 determines the direction code between each pair of adjacent contour pixels in sequence until the trace returns to the start point. In the preferred embodiment, the direction codes are those shown in FIG. 4, an eight-direction code: east is 1, northeast is 2, north is 3, northwest is 4, west is 5, southwest is 6, south is 7, and southeast is 8.
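As a small illustration of the eight-direction code listed above, the following Python sketch maps the step between two adjacent contour pixels to its code. The name `direction_codes`, the (row, column) pixel tuples, and the convention that rows increase downward (so "north" means a smaller row index) are assumptions of this sketch. The ordered contour itself is taken as given, for example from a contour tracer.

```python
# Direction codes as listed in the text: E=1, NE=2, N=3, NW=4, W=5, SW=6, S=7, SE=8.
# Assumption: (row, column) coordinates with rows increasing downward,
# so a step "north" is (-1, 0) and a step "east" is (0, +1).
DIRECTION_CODE = {
    (0, 1): 1,    # east
    (-1, 1): 2,   # northeast
    (-1, 0): 3,   # north
    (-1, -1): 4,  # northwest
    (0, -1): 5,   # west
    (1, -1): 6,   # southwest
    (1, 0): 7,    # south
    (1, 1): 8,    # southeast
}


def direction_codes(contour):
    """Return the direction code from each contour pixel to the next one.

    contour -- ordered pixels of a closed contour, start pixel listed once;
               the last pixel is followed by the first (chain code).
    """
    codes = []
    for i, (r, c) in enumerate(contour):
        next_r, next_c = contour[(i + 1) % len(contour)]
        codes.append(DIRECTION_CODE[(next_r - r, next_c - c)])
    return codes


if __name__ == "__main__":
    # Border of a 3x3 block, traced clockwise from its top-left pixel.
    square = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    print(direction_codes(square))  # -> [1, 1, 7, 7, 5, 5, 3, 3]
```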

(1) Direction code

Once the contour tracing start point has been determined as described above, the direction code string is extracted as shown in the flowchart of FIG. 5. Because this method is disclosed in the prior art, for example in the aforementioned U.S. Patent No. 4,773,098, FIG. 5 is not described in further detail. See also Rafael C. Gonzalez and Paul Wintz, Digital Image Processing, Addison-Wesley Publishing Co., 1987, especially Section 8.1.1, and IEEE Trans. Comput., Vol. C-1, No. 11, pp. 1206-1216, November 1972.

(2) Director code

After the direction code string has been extracted, the director code string is extracted. FIG. 6 shows the procedure for finding the director code string Δθ(I) from the direction code string θ(I). In the figure, I denotes the position and S denotes the last position. The method of extracting a director code string is disclosed in Rafael C. Gonzalez and Paul Wintz, Digital Image Processing, Addison-Wesley Publishing Co., 1987, especially Sections 8.1.1 and 8.2.2. An embodiment applying it in the present invention is described here with reference to the image data (stored in RAM) of the English character "E" shown in FIG. 12.

After the contour tracing start point SP has been determined by the method of FIG. 3, the contour is traced clockwise from that point along the contour points, and the direction code string θ(I) is obtained as shown in attached Table 1. In step 601, the position I of the direction code string θ(I) is initialized (I = 1). In step 602 the next position (I = 2) is designated, and after step 603 the direction code values θ(1) and θ(2) of the first and second positions are read from RAM or from the storage of the computer in step 604. In step 605 the difference θ(2) - θ(1) is calculated, and the result (0) is given as the direction difference code value Δθ(2) of the second position. The procedure then returns through step 606 to step 602, where the position is incremented by one (I = 3), and Δθ(3) = θ(3) - θ(2) is calculated in the same way. The change of direction at the position of each contour pixel is thus obtained. If a direction difference code value is greater than 4, 8 is subtracted from it; if it is less than -4, 8 is added to it. That is, 5 becomes -3 and -5 becomes 3. The direction code string θ(I) must be treated as a chain code: when the position I of the direction code value θ(I) is the last position S, the next position is the first position. Therefore, if I = S + 1 in step 602, step 607 is processed after step 603. In this manner the direction difference code Δθ(I) of each pixel is calculated from the direction code θ(I), as shown in attached Table 1. In addition, the direction difference code Δθ(I) must satisfy the following expression.
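The computation described above can be sketched as follows. This is a minimal Python illustration rather than the flowchart of FIG. 6 itself: the function name `direction_differences` and the sample chain code are assumptions, as is the numbering of the eight directions (0 = east, increasing counterclockwise), which this passage does not specify.

```python
def direction_differences(theta):
    """Return the direction difference codes of a closed 8-direction chain code."""
    diffs = []
    for i in range(len(theta)):
        # theta[i - 1] wraps to the last element when i == 0 (closed contour).
        d = theta[i] - theta[i - 1]
        if d > 4:
            d -= 8      # e.g. 5 becomes -3
        elif d < -4:
            d += 8      # e.g. -5 becomes 3
        diffs.append(d)
    return diffs


# Hypothetical square traced clockwise: east, south, west, north.
square = [0, 0, 6, 6, 4, 4, 2, 2]
print(direction_differences(square))  # [-2, 0, -2, 0, -2, 0, -2, 0]
```

Under the assumed numbering, each corner of the clockwise square yields -2 and every straight step yields 0.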

As described above, a direction difference code value is determined from one selected contour point and the two contour points adjacent to it. A direction difference code value A at the I-th contour point (Δθ(I) = A) therefore means that, at the I-th contour point, the direction vector from the I-th contour point to the (I+1)-th contour point has turned by A × 45 degrees relative to the direction vector from the (I-1)-th pixel to the I-th. If the sign of A is positive, the change of direction is counterclockwise; if it is negative, the change is clockwise. In terms of the shape of the pattern, a positive direction difference code value means a concave shape and a negative value means a convex shape.

(3) Direction difference codes of the smoothing range

After the direction difference code string Δθ(I) has been determined, the direction difference code strings up to a predetermined smoothing range can be obtained in sequence, in accordance with a feature of the present invention.

In general, characters or figures handwritten on a document have undue irregularities, that is, noise, on the straight portions or ends of their contours, caused by ink bleeding and the like depending on the quality of the paper. Such noise can arise not only from paper quality but also from sensing errors of the scanner. In addition, stroke ends (serifs) differ considerably with the style of the writer and with the typeface. To solve these problems, a method that removes noise and the style differences of handwriting and fonts without damaging the characteristic portions of a character can be realized by extracting the direction difference codes of a predetermined smoothing range according to the present invention.

In this specification, the term "N-th smoothing range" is defined as the minimum number N of contour points without a change of direction that lie between two contour points (pixels) at which a change of direction occurs in the direction difference code string. A run of contour points without a change of direction should be understood as forming a straight line. Referring to FIG. 12, it is readily seen that the image data of the English character "E" contains much noise, particularly at the ends. A method of removing the noise of the character "E" is described below as an example. The portions underlined with dotted lines along the contour points of FIG. 12 correspond, in order from the contour tracing start point SP, to the underlined direction difference code values in the code string Δθ(I) of attached Table 1. Each of these portions is a run of consecutive nonzero direction difference values, and most of them are noise to be removed. To treat such noise, the sum of the consecutive nonzero values is calculated and placed at the central position of the positions corresponding to those values, while all remaining positions are set to 0. For example, the second and third runs of consecutive nonzero values from the start point SP, namely -1, -2, -1 and 1, -1, 1, 1, are processed into 0, -4, 0 and 0, 2, 0, respectively.
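The run-collapsing step just described can be sketched in Python as below. This is an illustration under stated assumptions, not the flowchart of FIG. 7: the function name `first_smoothing_range` is hypothetical, the centre of an even-length run is rounded down, and a contour with no zero at all is collapsed to a single value; the text specifies none of these details.

```python
def first_smoothing_range(delta):
    """Collapse each cyclic run of nonzero values to its sum at the run centre."""
    n = len(delta)
    out = [0] * n
    zeros = [i for i, v in enumerate(delta) if v == 0]
    if not zeros:
        # Assumption: a contour made of one unbroken run keeps only its sum.
        out[(n - 1) // 2] = sum(delta)
        return out
    start = zeros[0]            # begin scanning at a zero so no run is split
    i = 0
    while i < n:
        if delta[(start + i) % n] == 0:
            i += 1
            continue
        length, total = 0, 0
        while delta[(start + i + length) % n] != 0:
            total += delta[(start + i + length) % n]
            length += 1
        centre = (start + i + (length - 1) // 2) % n   # rounded down (assumption)
        out[centre] = total
        i += length
    return out


# Hypothetical string whose last run (1, -2) wraps around to the first value (-2).
delta = [-2, 0, 0, -1, -2, -1, 0, 0, 1, -2]
print(first_smoothing_range(delta))  # [0, 0, 0, 0, -4, 0, 0, 0, 0, -3]
```

The sample reproduces the two cases quoted in the text: -1, -2, -1 becomes 0, -4, 0, and a run that wraps past the start point is summed as a single run.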

Because the code string is always treated as a chain code string, the last run of consecutive nonzero values, 1, -2, -2 (the final value being that of the start point), is processed into the first-smoothing-range code values 0, -3, 0 (the final value being the resulting start point value). The resulting code string, that is, the direction difference code string Δθ(1, I) of the first smoothing range, is shown in attached Table 1, where I denotes the position. The shape code string is the array obtained by extracting from a direction difference code string the values that indicate a change of direction (the nonzero direction difference code values). Because each direction difference code value in the shape code string corresponds to a particular shape produced by a change of angle, as described above, the shape code string does not express the size of the pattern or the length of its lines, but it does determine the shape of the entire contour of the pattern. Since the embodiment of the present invention uses 8-direction codes, it is readily understood that a direction difference code value of -2 is a 90-degree convex portion, 2 is a 90-degree concave portion, -4 is a rounded convex portion, and 4 is a rounded concave portion. The overall contour shape is therefore determined by taking the values of the shape code string in turn, clockwise from the start point, and drawing the contour until the start point is reached again.
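As a small illustration of the definition above, the shape code string can be obtained by keeping the nonzero values in contour order. The names `shape_code_string` and `SHAPE_MEANING` and the sample values are hypothetical; only the four meanings listed come from the text.

```python
def shape_code_string(delta):
    """Keep only the nonzero direction difference values, in contour order."""
    return [v for v in delta if v != 0]


# Meanings stated in the text for the 8-direction code.
SHAPE_MEANING = {
    -2: "90-degree convex portion",
    2: "90-degree concave portion",
    -4: "rounded convex portion",
    4: "rounded concave portion",
}

smoothed = [0, 0, 0, 0, -4, 0, 0, 2, 0, -2]   # hypothetical smoothed string
for code in shape_code_string(smoothed):
    print(code, SHAPE_MEANING.get(code, "other change of direction"))
```

Because the zeros are discarded, two contours that differ only in the lengths of their straight segments yield the same shape code string.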

FIG. 12A shows the pattern drawn from the shape code string extracted from the direction difference code string Δθ(1, I) of the first smoothing range given in attached Table 1. Compared with the pattern of FIG. 12, it is readily seen that much of the noise and of the structure of the font stroke ends has been removed.

FIG. 7 shows a preferred detailed flowchart for extracting the direction difference code string Δθ(1, I) of the first smoothing range from the direction difference code string Δθ(I), as described above. The variables used in FIG. 7 have the following meanings.

I = position

R = range

Δθ(I) = direction difference value at position I of a contour point

A = first position, counted from the start point, at which the direction difference value is 0

Δθ(R, I) = direction difference value at position I in the R-th range

S = last position

SB, SC, SD = sums of consecutive nonzero direction difference values

B = first position with a nonzero value in a run of consecutive nonzero direction difference values

C = increment from position B

[ ] = Gauss bracket (greatest-integer function)

The flowchart of FIG. 7 will not be described further, because a person of ordinary skill in the art can readily understand it in light of the foregoing description.

In the pattern of FIG. 12A, the corner portions (1210 to 1215) require further processing. The noise on the straight portions has been removed, but improper shapes still remain at the corners. If these shapes are not removed, the number of shape code values of the reference patterns that must be stored in ROM for pattern matching increases greatly. Direction difference code strings with progressively more noise removed must therefore be extracted, increasing the smoothing range until an appropriate shape is obtained. To this end, the direction difference code strings up to a predetermined range R are extracted sequentially, starting from the direction difference code string Δθ(1, I) of the first smoothing range.

FIGS. 8A to 8E show a flowchart for extracting the direction difference code strings Δθ(R, I) up to a predetermined range R. Before the flowchart is described in detail, the method of extracting the direction difference code string Δθ(R+1, I) of the next range from the string Δθ(R, I) of the R-th range is explained briefly. In Δθ(R, I), runs of consecutive values are selected in which exactly R zeros lie between two nonzero direction difference values. The sum of the two nonzero values is then calculated and placed at the central position of the run, and all other positions of the run are set to 0. Δθ(R+1, I) is extracted by repeating this calculation while treating Δθ(R, I) as a chain code.
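The step from range R to range R+1 can be sketched as a single greedy pass over the cyclic string. This is a simplified illustration and not a transcription of FIGS. 8A to 8E: the function name `next_smoothing_range`, the left-to-right pairing order, and the rounding down of the centre position are all assumptions.

```python
def next_smoothing_range(delta, r):
    """One greedy pass merging pairs of nonzero values separated by exactly r zeros."""
    n = len(delta)
    out = list(delta)
    i = 0
    while i < n:
        j = (i + r + 1) % n   # position of the second nonzero value (wraps around)
        gap_is_zero = all(out[(i + k) % n] == 0 for k in range(1, r + 1))
        if out[i] != 0 and out[j] != 0 and j != i and gap_is_zero:
            total = out[i] + out[j]
            out[i] = out[j] = 0
            out[(i + (r + 1) // 2) % n] = total   # centre, rounded down (assumption)
            i += r + 2
        else:
            i += 1
    return out


# Hypothetical range-2 string: -1 and -3 are separated by exactly two zeros.
print(next_smoothing_range([0, -1, 0, 0, -3, 0, 0, 0, 0, 0, -2, 0], 2))
# [0, 0, -4, 0, 0, 0, 0, 0, 0, 0, -2, 0]
```

Indexing modulo the string length lets a pair that straddles the last and first positions be merged in the same way, with the sum landing in either the tail or the head of the string.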

Referring to FIG. 8, steps 811 to 814 search the direction difference code string Δθ(R, I) of the R-th range for a nonzero direction difference value, beginning at the start point (I = 1). When one is found, its position is stored as J (step 815). It is then determined whether the stored position J has reached the tail position (S-R-1) of the code string Δθ(R, I) (step 816). If it has not, steps 817 and 818 check whether the code run is of the R-th range. If the run is not of the R-th range, steps 819 to 824 copy the values Δθ(R, I) at positions J to J+R to the corresponding positions of Δθ(R+1, I). Position I is then incremented by 1 (step 825) and compared with the last position S (step 826). If I is greater than S, I becomes I-S (= 1) (step 827). Otherwise it is determined whether Δθ(R, I) is 0 (step 828). If Δθ(R, I) = 0, the value Δθ(R+1, I) is set to 0 repeatedly until a position I satisfying Δθ(R, I) ≠ 0 is found (steps 825 to 829). If Δθ(R, I) ≠ 0, the procedure moves from step 828 to step 815.

If the code run is of the R-th range, the procedure moves from step 818 to step 830. Steps 830 to 834 reset to 0 all values Δθ(R+1, A) at the positions from J to J+R. The sum SUM calculated in step 818 is then placed at the central position of the run (step 835). Next, it is determined whether position A is the last position S (step 836). If A ≠ S, position I is set to J+R+1 (= A) (step 837) and step 825 is performed. If A = S, the procedure goes from step 836 to step 850.

If the position J of the nonzero direction difference value is greater than S-R-1, the procedure goes from step 816 to step 838. The steps that follow handle the tail of the direction difference code string and the head connected to it. Steps 838 and 839 are similar to steps 817 and 818 described above. If the run is not of the R-th range, steps 840 to 849 copy the value Δθ(R, J) at position J to Δθ(R+1, J) at the same position and set Δθ(R+1, I) to 0 at all remaining positions (J+1 to S and 1 to J+R+1-S). The determination of step 850 is then made. If R is not the designated smoothing range, R is incremented by 1 in step 851 and step 811 is performed.

If the direction difference code run selected in step 839 is of the R-th range, the procedure of FIG. 8D is performed. Step 852 determines whether the central position of the selected run lies in the head or in the tail. If it lies in the tail, the sum SUM is placed at the central position in step 859, and the values Δθ(R+1, I) from position J to the last position S, other than the central position, are reset to 0 in step 857. The values Δθ(R+1, I) at the head positions (1 to J+R+1-S) are reset to 0 through steps 860 to 863. Step 850 is then performed via step 861.

If, on the other hand, step 852 finds that the central position lies in the head, the procedure goes from step 852 to step 864. Steps 864 to 868 set the values Δθ(R+1, I) from position J to the last position S to 0. Steps 869 to 874 handle the head: the sum SUM is placed at the central position in step 874, and the values Δθ(R+1, I) at the positions other than the central position are reset to 0 in step 871. Step 850 is then performed.

Attached Table 1 shows the direction difference code strings up to the fifth range, together with the corresponding shape code strings, obtained by applying the above algorithm to the image pattern of FIG. 12. The patterns corresponding to the shape code strings for R = 1 to R = 5 are shown in FIGS. 12A to 12E. As the figures show, more and more noise is removed as the smoothing range increases. In FIG. 12E, however, the end portions (1270 to 1272) still remain. These portions (1270 to 1272) can be made to match the reference pattern by removing them with the substitution process described later.

Attached Table 2 shows the final direction difference and shape code strings, processed up to smoothing range 6, for the image pattern of the English character "E" shown in FIG. 13. Referring to FIGS. 13A to 13C in connection with attached Table 2, FIG. 13A shows the shape corresponding to smoothing range R = 2 or 3, FIG. 13B the shape corresponding to R = 4, and FIG. 13C the shape corresponding to R = 5 or 6. As the figures show, the shape code string of range 5 or 6 can be used as the reference pattern of the English character "E".

Attached Table 3 shows the code strings obtained by processing the image pattern of the Chinese character shown in FIG. 14. FIGS. 14A to 14E show the shapes corresponding to the shape code strings for ranges R = 1 to R = 9 in attached Table 3, obtained by contour tracing from the start point (SP1) of the pattern of FIG. 14, and FIGS. 14G to 14J likewise show the shapes corresponding to the ranges in attached Table 3 for the start point (SP2).

Attached Table 4 shows the code strings of the character elements corresponding to the contour tracing start points SP1 and SP2 for the character of FIG. 15, which is the same Chinese character as that of FIG. 14 but written with a brush. FIG. 15A is the shape corresponding to range 7 for the start point (SP1) in attached Table 4, FIG. 15B is the corresponding shape after the substitution process described later, and FIG. 15C is the shape corresponding to R = 5, 6 or 7 for the start point (SP2).

Attached Table 5 shows the code strings obtained by processing the Hangul image pattern of FIG. 16. FIG. 16A is the shape corresponding to range 7, and FIG. 16B shows the shape after the substitution described later.

Attached Table 6 shows the code strings obtained by processing the brush-written Japanese image pattern of FIG. 17, and FIGS. 17A to 17D show the shapes corresponding to the respective ranges of attached Table 6.

Attached Table 6 shows the code strings obtained by processing a Japanese image pattern that is the same character as that of FIG. 17 but in a different style, and FIGS. 18A to 18D show the shapes corresponding to the respective ranges of attached Table 6.

As the above figures show, extracting the direction difference code strings from an image pattern up to a predetermined range yields a shape from which noise has been removed, and a given reference shape can be obtained regardless of handwriting or print style.

[Substitution]

Pattern recognition uses a pattern matching method, in which reference patterns (a dictionary) stored in ROM or in other storage means are compared with the processed data. Reducing the size of the dictionary therefore reduces not only the cost of the hardware but also the computation time of the system. To that end it is important to reduce the amount of reference pattern data corresponding to each test image pattern. The present invention provides a solution by using a substitution method.

Referring to FIG. 12E, the end portions (1270 to 1272) vary considerably with the style of handwriting and the shape of the print font. If each of these end portions can be substituted so that it has a single shape code value, the amount of reference pattern data corresponding to this pattern can be reduced. Such substitution can be achieved by storing the substitution table shown in Table 8 below in ROM or in other storage means and using the microprocessor (22) programmed according to the flowchart of FIG. 9.

[Table 8]

[Substitution table]

To perform substitution, the shape code string ΔK(R, I) described above must be extracted from the direction difference code string of the selected range R, where I is the position of a code value. When the direction difference code strings Δθ(R, I) of the successive ranges are calculated from the direction difference code string Δθ(I) according to the embodiment of the present invention, a nonzero direction difference value may move to the head of the string. In that case, the shape code string is selected by taking in turn the nonzero direction difference values that follow the moved value, until the start position is reached again.

Referring to FIG. 9, the initial value of position I is given in step 910. In steps 912 to 914, the three consecutive values ΔK(R, I), ΔK(R, I+1) and ΔK(R, I+2) starting at position I are stored as A, B and C, respectively, and steps 915 and 916 determine whether (A, B) or (A, B, C) matches a code to be substituted in the substitution table. If there is no match, steps 912 to 916 are repeated while I ≤ T-2, where T denotes the total number of ΔK(R, I) values. If a code to be substituted is found, step 917 determines whether MF × R > L. Here MF is the magnification given in the table, R is the selected range, and L is the number of zeros lying between the outer direction difference values of the matching code (hereinafter called the distance). If the condition of step 917 is satisfied, the code is replaced by the corresponding substitute code (step 918) and position I is incremented by 1 (step 919). Step 911 is then performed. If the condition is not satisfied, the procedure returns to step 911. For example, the end portion 1270 of FIG. 12E has the direction difference value sequence -2, -4, 2. This sequence is in the substitution table, and the corresponding magnification MF is 3. The number of zeros between the outer direction difference values -2 and 2 is 7 + 5 = 12, as can be seen in FIG. 11G; that is, the distance L is 12. Since R = 5, MF × R = 15 > L is satisfied, and the direction difference values -2, -4, 2 are replaced by -4.
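The substitution test MF × R > L can be sketched as follows. The sketch is an assumption-laden illustration, not the flowchart of FIG. 9: the names `SUBSTITUTION_TABLE` and `substitute` are hypothetical, the table holds only the single entry quoted in the text because the full Table 8 is not reproduced here, and the cyclic handling of the tail and head is omitted for brevity.

```python
# Only the entry quoted in the text; the full Table 8 is not reproduced here.
SUBSTITUTION_TABLE = {(-2, -4, 2): (-4, 3)}   # pattern -> (substitute code, MF)


def substitute(values, positions, r, table=SUBSTITUTION_TABLE):
    """Replace table patterns whose outer values are close enough (MF * r > L)."""
    out = []
    i = 0
    while i < len(values):
        replaced = False
        for k in (2, 3):   # try (A, B) first, then (A, B, C)
            pattern = tuple(values[i:i + k])
            if len(pattern) == k and pattern in table:
                new_code, mf = table[pattern]
                # L: number of zeros between the outer values of the pattern.
                zeros = positions[i + k - 1] - positions[i] - (k - 1)
                if mf * r > zeros:
                    out.append(new_code)
                    i += k
                    replaced = True
                    break
        if not replaced:
            out.append(values[i])
            i += 1
    return out


values = [-2, -4, 2, 2, 2]        # hypothetical shape code values
positions = [1, 9, 15, 20, 30]    # their hypothetical contour positions (7 + 5 zeros)
print(substitute(values, positions, r=5))   # [-4, 2, 2]          since 3 * 5 > 12
print(substitute(values, positions, r=4))   # [-2, -4, 2, 2, 2]   since 3 * 4 > 12 fails
```

The second call shows the guard at work: with R = 4 the same three values are too far apart relative to MF × R, so they are left unchanged.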

After substitution in this manner, the shape of FIG. 12E becomes the shape of FIG. 12F. Therefore, if the direction difference sequence (-4, 2, 2, -4, 2, 2, -4, -2, -2) of the English character "E" is stored in ROM as reference pattern data, accurate pattern recognition can be achieved by using the substituted shape codes selected from the direction difference codes of a predetermined range.

If I ≤ T-2 is not satisfied in step 911, the steps of FIG. 9B are performed. Because ΔK(R, I) is treated as a cyclic code, the steps of FIG. 9B handle the tail and the head of ΔK(R, I). Since a code to be substituted has at most three values, steps 912 to 919 are performed when position I is T-1, and steps 920 to 924, 918 and 919 are performed when position I is T. These steps are basically the same as steps 912 to 919 described above.

The condition MF × R > L is imposed on substitution in order to restrict substitution to cases in which the significant direction difference values that characterize the pattern follow one another closely, and thereby to avoid changing the shape.

It can be seen that the shape code for R = 4 shown in FIG. 13B becomes identical to the shape code for R = 5 or 6 when the end portion 1310 is substituted. Likewise, the R = 9 shape of FIG. 14E becomes the shape of FIG. 14F by substitution, the shape of FIG. 15A becomes the shape of FIG. 15B, and the shape of FIG. 16A becomes the shape of FIG. 16B.

[Pattern recognition of text]

Recognizing each character on a page or document at its own independent smoothing range is not only inefficient but also time-consuming. However, the characters on a printed page are printed in a typeface of uniform size, and characters handwritten on a single sheet share the consistent style of the writer, so the pattern recognition of one page can be carried out with a substantially constant optimal smoothing range. In experiments on 1,500 characters according to the present invention, the inventors found the optimal smoothing range to be 5 for newspapers, 5 or 6 for children's books and novels, and 4 for dictionaries and similar material with small characters. In general, it has been found that characters that are large but thin-stroked have a small smoothing range, while characters that are large and thick-stroked have a large smoothing range.

A method of determining the optimal smoothing range from one page of text and recognizing patterns is described in detail below with reference to FIG. 10.

In step 110, the first line of the text is selected so that the optimal smoothing range can be determined from the characters on that line. In step 111, the microprocessor (22) loads the image data of the first character on the first line into the RAM in the microprocessor (22).

In step 112, the microprocessor (22) then determines the presence of inner and outer contours and finds the contour tracing start point, as described in connection with FIG. 3, and extracts the direction codes from the start point found. After the direction code string has been extracted, the direction difference code strings up to the designated smoothing range are extracted from it in step 113 and stored in the memory (24). Note that the method of extracting the direction difference codes has already been described in detail in connection with FIGS. 5 to 8. The designated smoothing range should be large enough for patterns of many kinds to be recognized; 7, for example, is preferred. The designated value is not limited to this, however: where a computer is used, it can be adjusted to suit the type of text through input means such as a keyboard. After the direction difference code strings have been extracted, steps 114 and 115 are executed. A detailed flowchart of steps 114 and 115 is shown in FIG. 14.

Referring to FIG. 14, the smoothing range R is set to 1 in step 141. Then, in step 142, the direction difference code string of the first range stored in the memory is accessed, and a shape code string is obtained by the replacement procedure already described with reference to FIG. 9. Next, in step 143, the shape code string is compared, for pattern matching, with the reference pattern data stored in the dictionary, that is, in the ROM. If it is found in the dictionary, the pattern is recognized in step 144, the success number is counted as SN(1)=1 in step 145, and the range R is incremented by one to 2 in step 146. If it is not found in the dictionary, the success number SN(1) keeps its initial value of 0, and the range R is incremented to 2 in step 146. If, in step 147, the range R is less than or equal to the designated smoothing range, steps 142 to 146 are executed again for R=2. In this way the success number SN(R) for the first character is determined for every range up to the designated range. If the range R is greater than the designated smoothing range, the procedure moves on to the next step, namely step 116 of FIG. 10. In step 116, it is determined whether the selected character is the last character of the first line. If not, the next character on the first line is selected in step 117, and steps 112 to 116 are executed again for that character. After steps 112 to 117 have been executed repeatedly in this way for each character on the first line, the maximum value of the success number SN(R) can be found, and the optimum smoothing range OSR is determined by reading out the range corresponding to that maximum (step 118).
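The selection of the optimum smoothing range from the first line can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the patented implementation: the function name, the `recognizes` callback (standing in for steps 142 to 144, that is, shape code extraction and the dictionary lookup), and the rule of preferring the smallest range when several ranges tie are all hypothetical and do not come from the description.

```python
from typing import Callable, List, Sequence, Tuple


def choose_optimal_smoothing_range(
    characters: Sequence[object],
    max_range: int,
    recognizes: Callable[[object, int], bool],
) -> Tuple[int, List[int]]:
    """Return (OSR, SN), where SN[r] is the success count for range r.

    `recognizes(ch, r)` is a hypothetical stand-in for steps 142 to 144:
    it should return True when the shape code string derived from the
    direction difference code string of range r is found in the dictionary.
    """
    # SN is indexed 1..max_range; index 0 is unused and stays 0.
    success = [0] * (max_range + 1)
    for ch in characters:                    # steps 116 and 117: each character on the line
        for r in range(1, max_range + 1):    # steps 141, 146, 147: R = 1 .. designated range
            if recognizes(ch, r):            # steps 142 to 144: dictionary match at range r
                success[r] += 1              # step 145: count the success
    # Step 118: read out the range with the largest success count.
    # Assumption: on a tie the smallest range wins (max returns the first maximum).
    osr = max(range(1, max_range + 1), key=lambda r: success[r])
    return osr, success


# Toy data: each "character" is just the set of ranges at which it would match.
first_line = [{3, 4}, {4, 5}, {4}, {2, 3, 4}]
osr, sn = choose_optimal_smoothing_range(first_line, 7, lambda ch, r: r in ch)
```

With the toy data above, range 4 is the only range at which all four characters match, so it is the one selected.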

After the optimum smoothing range OSR has been determined, pattern recognition from the next line to the last line of the text is carried out according to the flowcharts of FIGS. 10B and 10C. After the first character of the second line is selected in steps 119 and 120, the presence of outer and inner contours is determined, the contour-trace starting points are extracted, and the direction code string is extracted for this character in step 121. In step 122, direction difference code strings are extracted from the extracted direction code string for every range up to the optimum smoothing range OSR and are stored. Next, in step 123, the direction difference code string of the optimum smoothing range is selected, and a shape code string is extracted from it. In step 124 the extracted shape code string is replaced by comparison with the replacement table, and in step 125 it is compared with the dictionary described above. If it is found in the dictionary, the character is recognized in step 126, and step 127 determines whether it is the last character of the selected line. If it is not the last character, the next character is selected in step 128 and the steps from step 121 onward are carried out. If it is the last character of the selected line, step 129 determines whether the line is the last line of the page. If it is not the last line, the next line is selected in step 119 and the steps from step 120 onward are executed. If it is the last line, text pattern recognition ends.

If, in step 125, the shape code string is not found in the dictionary, the procedure of FIG. 10C is executed. This case means that pattern recognition has failed at the optimum smoothing range, so the smoothing range must be changed. It has been found that, in such cases, pattern recognition in practice succeeds at a smoothing range close to the optimum smoothing range. Referring to FIG. 10C, steps 130 to 136 perform pattern recognition while decreasing the smoothing range AR one at a time from the optimum smoothing range OSR. If pattern recognition still has not succeeded when the smoothing range AR is 1, the procedure passes from step 135 to step 137, and in steps 138 to 142 pattern recognition is performed while increasing the smoothing range BR one at a time. In step 139, the direction difference code string of smoothing range BR can be computed from the direction difference code string of the optimum smoothing range already computed in step 122. In step 132, by contrast, the direction difference code string of smoothing range AR is simply selected from the direction difference code strings computed in step 122, so no computation is required.
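The order in which alternative smoothing ranges are tried can be sketched as below. This is a hedged illustration only: `fallback_ranges`, `recognize_with_fallback`, and the `lookup` callback are hypothetical names, and two points are assumptions rather than statements from the description, namely that the downward search starts at OSR-1 (the OSR itself having already failed in step 125) and that the upward search stops at a given `max_range`.

```python
from typing import Callable, Iterator, Optional


def fallback_ranges(osr: int, max_range: int) -> Iterator[int]:
    """Yield the smoothing ranges to try after recognition fails at the OSR."""
    # Steps 130 to 136: AR decreases one at a time, down to 1.
    # These strings were already computed in step 122, so they are only selected.
    yield from range(osr - 1, 0, -1)
    # Steps 137 to 142: BR increases one at a time.
    # Assumption: the search stops at max_range; the text gives no upper bound.
    yield from range(osr + 1, max_range + 1)


def recognize_with_fallback(
    character: object,
    osr: int,
    max_range: int,
    lookup: Callable[[object, int], Optional[str]],
) -> Optional[str]:
    """Try the OSR first (step 125), then the neighbouring ranges.

    `lookup(character, r)` is a hypothetical stand-in for extracting the
    shape code string at range r and comparing it with the dictionary;
    it returns the recognized character, or None when there is no match.
    """
    result = lookup(character, osr)
    if result is not None:
        return result
    for r in fallback_ranges(osr, max_range):
        result = lookup(character, r)
        if result is not None:
            return result
    return None  # no range produced a dictionary match


order = list(fallback_ranges(4, 7))
```

For an optimum range of 4 and an assumed maximum of 7, the ranges are tried in the order 3, 2, 1, 5, 6, 7.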

Hangul consists of 24 jamo (letters): 10 vowels and 14 consonants. Each jamo has its own sound value and its own independent form. A single Hangul character, however, is composed of two or more jamo, so its sound is the combination of an initial and a medial, or of an initial, a medial, and a final. Vowels are used only as medials, and consonants only as initials and finals. Moreover, within a Hangul character the initial is always positioned above the final. Therefore, when the Hangul character to be recognized consists of three unconnected jamo, and the feature value strings of the reference patterns stored in the ROM are arranged in the order initial, medial, final, the initial and the final can be distinguished by the row values of the extracted contour-trace starting points, and pattern recognition becomes possible by processing them according to the present invention in the order initial, medial, final.
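The positional rule above can be illustrated with a short Python sketch. It rests on assumptions that are not in the description: that each unconnected component has already been matched to a jamo, that row indices grow downward (so a smaller row value means a higher position), and that the `Component` type and `order_jamo` function are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass
from typing import List

# The 10 basic vowels mentioned in the text; a vowel is always the medial.
VOWELS = set("ㅏㅑㅓㅕㅗㅛㅜㅠㅡㅣ")


@dataclass
class Component:
    jamo: str        # jamo matched for this unconnected component (assumed known)
    start_row: int   # row of its contour-trace starting point (smaller = higher)


def order_jamo(components: List[Component]) -> List[str]:
    """Arrange components as [initial, medial] or [initial, medial, final]."""
    vowels = [c for c in components if c.jamo in VOWELS]
    # Consonants sorted top to bottom: the initial always lies above the final.
    consonants = sorted(
        (c for c in components if c.jamo not in VOWELS),
        key=lambda c: c.start_row,
    )
    if len(vowels) != 1 or not 1 <= len(consonants) <= 2:
        raise ValueError("expected one vowel and one or two consonants")
    ordered = [consonants[0].jamo, vowels[0].jamo]   # initial, medial
    if len(consonants) == 2:
        ordered.append(consonants[1].jamo)           # final
    return ordered


# Components of the character 간, given in arbitrary extraction order.
parts = [Component("ㄴ", 30), Component("ㅏ", 5), Component("ㄱ", 2)]
ordered = order_jamo(parts)
```

In this example the consonant with the smaller row value is taken as the initial and the lower one as the final, giving the order ㄱ, ㅏ, ㄴ.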

Chinese characters can likewise be formed by combining unconnected components, and they have a structural order made up of the left-side (byeon), right-side (bang), top (meori), hanging (eom), lower-left (batchim), bottom (bal), and enclosure (mom) components. Pattern recognition can be achieved by making use of this structure. For example, for the Chinese character shown in FIG. 14, if the feature value strings of the components' reference patterns are arranged according to the order in which the contour-trace starting points are extracted, pattern recognition becomes possible by processing according to the present invention in that starting-point extraction order.

As described above, the present invention can recognize patterns regardless of the style of the characters and, because processing is performed on direction difference codes, has the advantage that patterns can be recognized regardless of the direction in which the paper is scanned or the position of the paper.

[Table 1]

[Table 2]

[Table 3]

[Table 4]

[Table 5]

[Table 6]

[Table 7]

Claims

1. A method of processing an image pattern from a memory storing image data of a pattern to be recognized, comprising: extracting, from the stored image data, a direction code string obtained by tracing the contour of the pattern; extracting, from the direction code string, a direction difference code string that gives the change in direction at each contour pixel point; and extracting, from the direction difference code string, a direction difference code string of a predetermined smoothing range in order to remove variations and noise caused by the characteristics of the pattern and by font differences of the pattern.

A method of processing an image pattern to determine outer and inner contours from an input image pattern, the method comprising: scanning the image pattern along a plurality of lines in one direction; detecting, during the scanning, a first contour point at which the scan meets the contour of the image pattern; tracing the contour from the first contour point when it is detected; detecting, during scanning of the same line, a second contour point on the image pattern that has not been contour-traced; and determining whether the second contour point belongs to an inner or an outer contour according to the number of contour points detected before the second contour point during scanning of the same line.

A method of processing an image pattern from a direction code string that traces the contour of the pattern from image data of an input pattern, and from a direction difference code string that gives the change in direction at each contour point extracted from the direction code string, the method comprising: storing replacement codes in a memory device; extracting direction difference code strings for a predetermined smoothing range from the direction difference code string; and comparing a first code string selected from among the extracted direction difference code strings with the replacement codes and replacing it.

A text pattern recognition method for recognizing a pattern by comparing data obtained from image data of an input pattern with data of a reference pattern, the method comprising: selecting image data of a predetermined number of patterns from among the text patterns; extracting a first group of processed data from each selected image data; comparing the first data group with the data of the reference pattern and obtaining a pattern recognition success number; and recognizing the remaining, unselected patterns according to the processing method of the data group having the highest success number.

A method of recognizing a pattern by comparing data processed from image data, read from a memory device storing image data of input text patterns, with reference pattern data stored in a memory, the method comprising: selecting image data of a predetermined number of patterns from among the patterns; extracting, from each selected image data, a direction code string that traces the contour of the pattern, and a direction difference code string that gives the change in direction at each contour point extracted from the direction code string; extracting direction difference code strings up to a predetermined range from the extracted direction difference code string of each pattern, comparing data from each extracted direction difference code string with the reference pattern data, and selecting the range with the highest pattern recognition success rate; and extracting the direction difference code string of the selected range from the remaining patterns and comparing its data with the reference pattern data.

An apparatus for processing an image pattern from a memory storing image data of a pattern to be recognized, the apparatus comprising: means for extracting, from the stored image data, a direction code string obtained by tracing the contour of the pattern; means for extracting, from the direction code string, a direction difference code string that gives the change in direction at each contour pixel point; and means for extracting, from the direction difference code string, a direction difference code string of a predetermined smoothing range in order to remove variations and noise caused by the characteristics of the pattern and by font differences of the pattern.

An apparatus for processing an image pattern to determine outer and inner contours from an input image pattern, the apparatus comprising: means for scanning the image pattern along a plurality of lines in one direction; means for detecting, during the scanning, a first contour point at which the scan meets the contour of the image pattern; means for tracing the contour from the first contour point when it is detected; means for detecting, during scanning of the same line, a second contour point on the image pattern that has not been contour-traced; and means for determining whether the second contour point belongs to an inner or an outer contour according to the number of contour points detected before the second contour point during scanning of the same line.

An apparatus for processing an image pattern from a direction code string that traces the contour of a pattern from image data of an input pattern, and from a direction difference code string that gives the change in direction at each contour point extracted from the direction code string, the apparatus comprising: means for storing replacement codes in a memory device; means for extracting direction difference code strings for a predetermined smoothing range from the direction difference code string; and means for comparing a first code string selected from among the extracted direction difference code strings with the replacement codes and replacing it.

A text pattern recognition apparatus for recognizing a pattern by comparing data obtained from image data of an input pattern with data of a reference pattern, the apparatus comprising: means for selecting image data of a predetermined number of patterns from among the text patterns; means for extracting a first group of processed data from each selected image data; means for comparing the first data group with the data of the reference pattern and obtaining a pattern recognition success number; and means for recognizing the remaining, unselected patterns according to the processing method of the data group having the highest success number.

An apparatus for recognizing a pattern by comparing data processed from image data, read from a memory device storing image data of input text patterns, with reference pattern data stored in a memory, the apparatus comprising: means for selecting image data of a predetermined number of patterns from among the patterns; means for extracting, from each selected image data, a direction code string that traces the contour of the pattern, and a direction difference code string that gives the change in direction at each contour point extracted from the direction code string; means for extracting direction difference code strings up to a predetermined range from the extracted direction difference code string of each pattern, comparing data from each extracted direction difference code string with the reference pattern data, and selecting the range with the highest pattern recognition success rate; and means for extracting the direction difference code string of the selected range from the remaining patterns and comparing its data with the reference pattern data.