KR19990081569A

KR19990081569A - Feature Vector Value Extraction Method for Character Recognition

Info

Publication number: KR19990081569A
Application number: KR1019980015585A
Authority: KR
Inventors: 김형태
Original assignee: 윤종용; 삼성전자 주식회사
Priority date: 1998-04-30
Filing date: 1998-04-30
Publication date: 1999-11-15

Abstract

The invention relates to a feature vector value extraction method for character recognition, comprising: detecting the contour in a pixel array of predetermined size obtained by normalizing the image of a character block, then searching pixel by pixel and storing a contour point coordinate group made up of the point coordinates of each pixel on the connected contour path; storing a normalized point coordinate group obtained by normalizing the number of coordinates in the stored contour point coordinate group to a predetermined number of point coordinates; calculating and storing adjacent point coordinate angles, each being the angle of the line segment connecting adjacent point coordinates on the contour path of the normalized point coordinate group; and applying the adjacent point coordinate angles as input to an artificial neural network and extracting the output of the artificial neural network as the feature vector values.

According to the present invention, in a character recognition method using an artificial neural network, the character contour is detected and the angles between contour pixels along the contour path are extracted as the character's feature vector values, so that feature vector values for character recognition can be extracted more accurately and more quickly.

Description

Feature Vector Value Extraction Method for Character Recognition

The present invention relates to a feature vector value extraction method for character recognition. More particularly, it relates to a method for use in character recognition with an artificial neural network, in which the character contour is detected and the angles between contour pixels along the contour path are then extracted as the feature vector values of the character.

The human brain is characterized by a high degree of parallelism, fault tolerance, distributed representation of knowledge, and distributed control, and these characteristics stem from the distinctive neural tissue that humans possess. As shown in FIG. 1, human neural tissue consists of nerve cells called neurons.

A neuron is a cell specially differentiated for information processing in a living body. It is divided into three parts: the cell body (soma), which is the main body; the intricately branched parts called dendrites; and the part called the axon, which extends from the main body as a single strand and splits into many branches at its end.

The axon is a nerve fiber that carries signals from the cell body to other neurons, and the dendrites are the parts that receive signals from other neurons. That is, the axon terminals of other neurons connect to the dendrites. This connection is called a synapse.

The potential inside a neuron is normally lower than the potential outside, but when input signals arrive from outside through axons, under certain conditions the neuron becomes excited and its internal potential rises abruptly. The neuron is then said to have fired: a pulse with a duration of about 1 msec and an amplitude of about 0.1 V travels out along the axon and is delivered to other neurons as that neuron's signal. When an electrical pulse from another neuron reaches a dendrite, the potential at that point changes slightly. When pulses arrive at several points on the dendrites, the voltage at each of those points changes slightly. These changes propagate to the cell body and are summed. When the sum exceeds a certain threshold the neuron fires, and when it does not, no reaction occurs.

The mathematical model of a neuron has two main properties: linear additivity and a nonlinear threshold. Linear additivity means that the neuron multiplies the signals from other neurons by weights and sums them. The nonlinear threshold property means that nothing happens if the weighted sum does not exceed the threshold, whereas a single pulse is emitted if it does. This is explained below using the well-known McCulloch and Pitts (MP) model.

In the McCulloch and Pitts (MP) model, a neuron outputs 1 if Equation (1) is satisfied and 0 otherwise.

$$w_1 x_1 + w_2 x_2 + \cdots + w_M x_M > T \qquad (1)$$

$$D = w_0 x_0 + w_1 x_1 + \cdots + w_M x_M \qquad (2)$$

Here, $x_1, x_2, \ldots, x_M$ are the input pattern and $w_1, w_2, \ldots, w_M$ are the weights. With $w_0 = -T$ and $x_0 = 1$, the output is 1 if $D > 0$ in Equation (2) and 0 if $D < 0$. The term $w_0$ serves as the bias weight.
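The thresholding rule of Equations (1) and (2) can be sketched in a few lines of Python. This is only an illustration: the function name `mp_neuron` and the AND-gate weights are assumptions chosen for the example and do not come from the patent.

```python
def mp_neuron(inputs, weights, threshold):
    """Return 1 if the weighted input sum exceeds the threshold, else 0."""
    # Fold the threshold into a bias term as in Equation (2): w0 = -T, x0 = 1.
    d = -threshold * 1.0 + sum(w * x for w, x in zip(weights, inputs))
    return 1 if d > 0 else 0


# Hypothetical example: a two-input AND gate with unit weights and T = 1.5.
and_outputs = [mp_neuron([a, b], [1.0, 1.0], 1.5) for a in (0, 1) for b in (0, 1)]
```

With these illustrative values the unit fires only when both inputs are 1, which shows how the choice of weights and threshold determines the decision the unit makes.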

Artificial neural network models, also called connectionist models, parallel distributed processing models, neuromorphic systems, or simply "neural networks", aim to achieve high classification performance through the interconnection of simple computational elements (that is, nodes).

The difficulty in using an artificial neural network lies in finding the weights that make it behave correctly for the problem at hand (training). Once the weights have been found accurately, classifying samples with the network becomes straightforward.

An artificial neural network is defined by modeling biological human neurons in engineering terms according to the concepts described above. It typically uses a computational model arranged as a sensory layer (input layer), a connection layer (hidden layer), and a response layer (output layer), and it can be regarded as a model that solves the problem at hand through learning, that is, by modifying its own internal state on the basis of this computational model.

The artificial neural networks proposed to date take various forms depending on the neuron activation function, the network structure, and the learning rule that adjusts the connection strengths between neurons. Representative examples include the perceptron, which is the basic neural network, as well as the Hopfield network, self-organizing maps, the neocognitron, the backpropagation model, Adaptive Resonance Theory (ART), and the Boltzmann machine.

To aid understanding of the present invention, which recognizes characters through learning in connection with the artificial neural networks described above, a general character recognition method is described first.

FIG. 2 is a flowchart illustrating a general character recognition method.

First, in the original image input step (S10), an original image is received from a scanned image obtained by scanning a document with a document reading means such as a scanner, or from an image file previously stored on a storage medium. Then, in the preprocessing step (S20), the original image is preprocessed to make it suitable for character recognition, by removing noise or by separating the background region from the character region.

Next, in the feature vector value extraction step (S30), feature vector values defined to reflect the characteristics of the character are extracted, by various methods, from each character separated out as a character region.

In the recognition algorithm step (S40), the character corresponding to these feature vector values is recognized using a pattern recognition algorithm such as the artificial neural network described above, template matching, a fuzzy algorithm, syntactic methods, a genetic algorithm, or dynamic programming.

Then, in the post-processing step (S50), the original image for the recognized character portion is restored, and the above process is repeated to process the next character.

As with most recognition algorithms, including the artificial neural network described above, providing relatively high confidence in the recognition result requires extracting a relatively large number of feature vector values, which naturally slows convergence. The number of feature vector values and the reliability of the recognition result are therefore in a tradeoff relationship.

In an artificial neural network, however, if too many feature vector values are applied to the input layer, a relatively large number of hidden-layer nodes must be allocated to obtain accurate feature vector values at the output. With many hidden-layer nodes, training the network takes an enormous amount of time, and the added complexity raises the probability that inaccurate feature vector values are output, which in turn lowers both the actual recognition rate and the recognition speed.

Accordingly, to optimize the number of feature vector values so as to reduce the complexity of the artificial neural network and improve recognition speed while still securing a high recognition rate, it must be possible to extract feature vector values that reflect the features of the character pattern to be recognized as fully as possible, not only for clean printed character patterns but also for patterns that are geometrically deformed or somewhat damaged.

The present invention was devised to address this need in the character recognition process. Its object is to provide a feature vector value extraction method for character recognition, for use with an artificial neural network, in which the character contour is detected and the angles between contour pixels along the contour path are extracted as the character's feature vector values, so that feature vector values can be extracted more accurately and more quickly.

FIG. 1 is an exemplary view showing the structure of a neuron in the human nervous system.

FIG. 2 is a flowchart illustrating a general character recognition method.

FIG. 3 is a block diagram showing a preferred embodiment of the feature vector value extraction method for character recognition according to the present invention.

FIG. 4 is an exemplary view showing an example of feature vector values extracted with the feature vector value extraction method for character recognition according to the present invention.

To achieve this object, the feature vector value extraction method for character recognition according to the present invention comprises: detecting the contour in a pixel array of predetermined size obtained by normalizing the image of a character block, then searching pixel by pixel and storing a contour point coordinate group made up of the point coordinates of each pixel on the connected contour path;

storing a normalized point coordinate group obtained by normalizing the number of coordinates in the stored contour point coordinate group to a predetermined number of point coordinates;

calculating and storing adjacent point coordinate angles, each being the angle of the line segment connecting adjacent point coordinates on the contour path of the normalized point coordinate group; and

applying the adjacent point coordinate angles as input to an artificial neural network and extracting the output of the artificial neural network as the feature vector values.

Here, the adjacent point coordinate angle is preferably computed trigonometrically from each pair of adjacent point coordinates.

A preferred embodiment of the feature vector value extraction method for character recognition according to the present invention is described below with reference to the accompanying drawings.

FIG. 3 is a block diagram showing a preferred embodiment of the feature vector value extraction method for character recognition according to the present invention. For consistency, steps that are the same as in the prior art are given the same reference numerals.

In the preferred embodiment shown in FIG. 3, the original image input step (S10) receives an original image from a scanned image obtained by scanning a document with a document reading means such as a scanner, or from an image file previously stored on a storage medium. The preprocessing step (S20) then preprocesses the original image to make it suitable for character recognition, by removing noise or by separating the background region from the character region.

The feature vector extraction process of the present invention is then performed. First, in step S100, as can be seen in FIGS. 4(a), 4(b), and 4(c), the contour is detected in an N×N pixel array obtained by normalizing the image of a character block, and the array is then searched pixel by pixel to store a contour point coordinate group made up of the point coordinates of each pixel on the connected contour path.
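The patent does not name a specific contour detection or tracing algorithm for step S100. The sketch below is one simple possibility, offered only as an assumption: a pixel counts as a contour pixel if it is a character pixel with at least one 4-connected background neighbor, and the path is followed by a greedy walk over 8-connected contour pixels. The function name `trace_contour` and the 0/1 grid representation are hypothetical.

```python
def trace_contour(grid):
    """Return the contour pixels of a binary grid as an ordered list of (x, y)."""
    n_rows, n_cols = len(grid), len(grid[0])

    def is_ink(x, y):
        # Pixels outside the array are treated as background.
        return 0 <= y < n_rows and 0 <= x < n_cols and grid[y][x] == 1

    four_neighbours = ((1, 0), (-1, 0), (0, 1), (0, -1))
    # Contour pixel: an ink pixel with at least one background 4-neighbour.
    contour = {
        (x, y)
        for y in range(n_rows)
        for x in range(n_cols)
        if is_ink(x, y)
        and not all(is_ink(x + dx, y + dy) for dx, dy in four_neighbours)
    }
    if not contour:
        return []

    # Start at the top-most, then left-most contour pixel.
    current = min(contour, key=lambda p: (p[1], p[0]))
    path, visited = [current], {current}
    # Candidate moves, tried in a fixed order (x grows right, y grows down).
    steps = ((1, 0), (1, 1), (0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1))
    while True:
        for dx, dy in steps:
            nxt = (current[0] + dx, current[1] + dy)
            if nxt in contour and nxt not in visited:
                path.append(nxt)
                visited.add(nxt)
                current = nxt
                break
        else:
            # No unvisited contour neighbour remains: the walk is finished.
            return path


# Hypothetical 5x5 character block containing a filled 3x3 square.
square = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
square_path = trace_contour(square)
```

A greedy walk of this kind handles simple closed shapes but can leave pixels unvisited on characters with several strokes or holes, so a production system would likely use a more robust boundary-following procedure.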

Next, in step S110, a normalized point coordinate group is stored, obtained by normalizing the number of coordinates in the stored contour point coordinate group to a predetermined number of point coordinates. In step S120, the adjacent point coordinate angles, each being the angle of the line segment connecting adjacent point coordinates on the contour path of the normalized point coordinate group, are calculated trigonometrically and stored.
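The patent does not state how the number of contour points is normalized in step S110. One plausible reading, shown here purely as an assumption, is to pick points at evenly spaced indices along the ordered contour so that every character yields the same number of points. The name `normalize_point_count` is hypothetical.

```python
def normalize_point_count(points, target_count):
    """Resample an ordered contour to exactly target_count points."""
    if not points or target_count <= 0:
        return []
    n = len(points)
    # Evenly spaced indices along the path; points repeat if n < target_count.
    return [points[(i * n) // target_count] for i in range(target_count)]


# Hypothetical contour of 8 points reduced to 4.
line = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (5, 0), (6, 0), (7, 0)]
reduced = normalize_point_count(line, 4)
```

Fixing the point count in this way gives the neural network an input of constant size regardless of how long the traced contour is.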

More specifically, when the coordinates of a pair of adjacent points are P1 $(x_1, y_1)$ and P2 $(x_2, y_2)$, the adjacent point coordinate angle is given by Equation 3.

For example, when the coordinates of a pair of adjacent points are (3, 4) and (5, 6), Equation 3 readily gives an adjacent point coordinate angle of 45°.
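Equation 3 itself does not appear in this text. A form consistent with the worked example is the arctangent of the slope between the two points, $\theta = \tan^{-1}\big((y_2 - y_1)/(x_2 - x_1)\big)$, and the sketch below assumes that form. It uses `math.atan2` instead of a plain arctangent so that vertical segments do not cause a division by zero; that choice, the function names, and the option to close the contour are assumptions of this sketch rather than details from the patent.

```python
import math


def adjacent_angle_deg(p1, p2):
    """Angle in degrees of the segment from p1 to p2, measured from the x axis."""
    dx = p2[0] - p1[0]
    dy = p2[1] - p1[1]
    return math.degrees(math.atan2(dy, dx))


def adjacent_angles(points, closed=True):
    """Angles between consecutive points; wraps to the first point if closed."""
    pts = list(points)
    followers = pts[1:] + (pts[:1] if closed else [])
    return [adjacent_angle_deg(a, b) for a, b in zip(pts, followers)]


# The worked example from the text: (3, 4) and (5, 6).
example_angle = adjacent_angle_deg((3, 4), (5, 6))
```

For the pair (3, 4) and (5, 6) both coordinate differences equal 2, so the angle is 45°, which matches the example in the description.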

Next, in step S130, the adjacent point coordinate angles are applied as input to the artificial neural network, and in step S140 the output of the artificial neural network is extracted as the feature vector values.
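Steps S130 and S140 can be pictured as a forward pass through a small feed-forward network. The patent does not specify the network architecture, activation function, input scaling, or weights, so everything in the sketch below is an assumption: one hidden layer, sigmoid units, angles divided by 180, and random untrained weights that serve only to make the example runnable.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def make_layer(n_in, n_out, rng):
    """Create n_out units; each row holds a bias followed by n_in weights."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_in + 1)] for _ in range(n_out)]


def forward(layer, inputs):
    """Apply one fully connected sigmoid layer to a list of inputs."""
    return [
        sigmoid(row[0] + sum(w * x for w, x in zip(row[1:], inputs)))
        for row in layer
    ]


def extract_features(angles_deg, hidden, output):
    """Scale angles to roughly [-1, 1] and pass them through both layers."""
    scaled = [a / 180.0 for a in angles_deg]
    return forward(output, forward(hidden, scaled))


# Hypothetical sizes: 4 input angles, 3 hidden units, 2 output features.
rng = random.Random(0)
hidden_layer = make_layer(4, 3, rng)
output_layer = make_layer(3, 2, rng)
features = extract_features([0.0, 90.0, 180.0, -90.0], hidden_layer, output_layer)
```

In practice the weights would be obtained by training on labeled character samples; the point here is only that the network compresses the list of angles into a shorter feature vector.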

Then, as in the prior art, the recognition algorithm step (S40) recognizes the character corresponding to these feature vector values using a pattern recognition algorithm such as an artificial neural network, template matching, a fuzzy algorithm, syntactic methods, a genetic algorithm, or dynamic programming. An artificial neural network is preferred here.

Finally, in the post-processing step (S50), the original image for the recognized character portion is restored, and the above process is repeated to process the next character.

The terminology used herein is defined in consideration of its function in the present invention and may vary according to the intention or practice of those skilled in the art. Its definitions should therefore be based on the content of this application as a whole.

In addition, because the present invention has been described here through its preferred embodiment, a person of ordinary skill in the art, considering the technical difficulty of the invention, could readily produce other embodiments and modifications. It is therefore evident that all embodiments and modifications that draw on the ideas described above fall within the claims of the present invention.

As described in detail above, the feature vector value extraction method for character recognition according to the present invention, used in character recognition with an artificial neural network, detects the character contour and then extracts the angles between contour pixels along the contour path as the character's feature vector values. This has the advantage that feature vector values for character recognition can be extracted more accurately and more quickly.

Claims

1. A feature vector value extraction method for character recognition, comprising:

detecting the contour in a pixel array of predetermined size obtained by normalizing the image of a character block, then searching pixel by pixel and storing a contour point coordinate group made up of the point coordinates of each pixel on the connected contour path;

storing a normalized point coordinate group obtained by normalizing the number of coordinates in the stored contour point coordinate group to a predetermined number of point coordinates;

calculating and storing adjacent point coordinate angles, each being the angle of the line segment connecting adjacent point coordinates on the contour path of the normalized point coordinate group; and

applying the adjacent point coordinate angles as input to an artificial neural network and extracting the output of the artificial neural network as the feature vector values.

2. The method of claim 1, wherein the adjacent point coordinate angle is computed trigonometrically from each pair of adjacent point coordinates.