KR100874747B1

KR100874747B1 - Camera Character Recognition Apparatus and Method Using Pixel Shift Document Image Recognition

Info

Publication number: KR100874747B1
Application number: KR1020070081866A
Authority: KR
Inventors: 김계경; 안현상; 이인호
Original assignee: 한국전자통신연구원
Priority date: 2006-12-02
Filing date: 2007-08-14
Publication date: 2008-12-19
Also published as: KR20080050272A

Abstract

The present invention relates to a camera character recognition apparatus and method using a pixel-shift document image combination recognition method. A camera document image is input as several pixel-shifted sub-document images; an image enhancement algorithm and binarization are applied to each sub-document image; character features are extracted with a combined character feature extractor, including pixel-shift mesh features; and the characters are recognized by weighted heterogeneous recognizers to obtain the final recognition result. Because the recognized result can be converted for use in document editing, database construction, speech output, and similar forms, a variety of services can be provided in response to user needs.

Description

Apparatus and method for camera character recognition using a pixel-shift document image combination recognition method

The present invention relates to technology for recognizing document images acquired with a camera, within the field of image processing. More particularly, it relates to a camera character recognition apparatus and method using a pixel-shift document image combination recognition method, in which pixel-shifted sub-document images are acquired with a camera, image enhancement and local binarization are applied to extract individual characters, features of the pixel-shifted individual characters are extracted, and the characters are recognized by weighted heterogeneous recognizers whose results are then combined.

The present invention derives from research conducted as part of the IT New Growth Engine Core Technology Development Project of the Ministry of Information and Communication and the Institute for Information Technology Advancement [project management number: 2005-S-003-02, project title: Development of software for producing photorealistic digital image content].

Unlike conventional recognition, in which a paper document is scanned before it is recognized, camera character recognition has the advantage that characters of any form existing in the real world can be acquired easily. That is, rather than being limited to paper documents fed to a scanner, camera character recognition can acquire the target characters very easily with a camera, recognize them, and then make use of the result. It can capture and recognize characters in outdoor as well as indoor environments, for example notices at tourist sites, explanatory texts for exhibits and other materials, monuments, signboards, and business cards. Because characters can be acquired with a portable camera regardless of the medium on which they are recorded, and the recognition results can then be put to use, camera character recognition has recently become an actively researched area of character recognition. It is also used in book-reading robots and in mobile phone camera character recognition.

Thus, camera-based character recognition has the advantage that real-world document information can be captured easily with a camera regardless of the medium on which the characters are written. Depending on the intended use, the acquired camera document image can be processed and recognized on the user's computer and used for document editing, database construction, and the like; alternatively, a portable terminal can send the character image to be recognized directly to a server, receive the recognition result back, and convert it into a form suited to the user's purpose, such as foreign-language translation or a real-time speech service. Real-world camera character recognition has recently been applied to mobile phone camera character recognition, book-reading robots, and similar uses as noted above, and user demand for it continues to grow.

However, unlike scanner character images, camera character images are captured under unconstrained conditions and are therefore strongly affected by ambient lighting and by the capturing camera, which is known to make them much harder to recognize than in conventional scanner-based character recognition. Techniques for camera character recognition have been published to compensate for this drawback.

The first of these approaches preprocesses the image to improve the camera document image: the image is enhanced with various filters, binarized using global and local binarization methods, and then passed to character recognition. Most camera document image recognition methods concern improving the camera document image in this way. This approach can be used to build databases of paper documents with a camera.

The second approach concerns acquiring and recognizing characters that exist in real environments with a camera: a document image is acquired with the camera, and the user designates the character region to be recognized. This approach is used to capture characters written on street signboards in real environments with a PDA camera and then recognize them. The user directly selects only the character region to be recognized from the background of the captured image, the region is sent to a server for recognition, and the recognition result is sent back to the user. The result is used for the user's convenience, for example to search for information the user needs or to translate the text into a foreign language.

The two approaches above mainly offer image enhancement and binarization methods for camera character images; they do not offer a character recognition method for the distorted conditions caused by camera distortion and ambient lighting. The main problem in recognizing document images acquired with a camera is to recognize characters while minimizing the effects of the camera's vignetting and of ambient lighting, and improvements to image enhancement and binarization alone cannot guarantee camera character recognition performance. Moreover, recognizing characters acquired in real environments calls for feature extraction or recognition methods that can handle a wide variety of character forms. Camera character recognition therefore requires a method that extracts character regions from the camera document image robustly with respect to the surroundings (minimizing distortion and environmental effects), together with image features and recognition methods that work regardless of the variety of character images captured by the camera.

Camera document images are blurred or noisy because of camera distortion and ambient lighting, which makes character segmentation and recognition quite difficult. Image enhancement algorithms and many new binarization techniques have been proposed to address this, but even when the target characters are extracted with these methods, it remains quite difficult to extract well-formed individual characters. Therefore, in addition to applying the image enhancement algorithms that have been the main focus of camera document recognition research, feature extraction and recognition methods must also be sought that can correctly recognize the deformed characters that arise when a document image is acquired with a camera.

Accordingly, the present invention has been made to solve the above problems of the prior art. Its object is to provide a camera character recognition apparatus and method using a pixel-shift document image combination recognition method that improves character recognition capability by inputting a camera document image as several pixel-shifted sub-document images, applying an image enhancement algorithm and binarization to each sub-document image, extracting character features with a combined character feature extractor including pixel-shift mesh features, and recognizing the characters with weighted heterogeneous recognizers to obtain the final recognition result.

Specifically, the camera document image is acquired as several pixel-shifted sub-document images; image quality is improved by preprocessing each sub-document image; and the brightness distribution of the pixels is examined to obtain a local binarization threshold, from which a binarized character image is produced. Because the proposed local binarization is up to about 10 times faster than conventional local binarization, processing several sub-document images poses no problem. Inputting and recognizing pixel-shifted sub-document images addresses the common problem that unrecognizable character regions at the edges of a camera document image leave some characters unrecognized. In addition, a merge-and-split algorithm is applied to each sub-document image to extract individual characters, and up, down, left, and right pixel-shift mesh features are then extracted and recognized by heterogeneous recognizers. The heterogeneous recognizers compensate for each other's weaknesses and thereby provide improved recognition performance. The result obtained by weighting each recognizer's output and multiplying the weighted outputs is taken as the final recognition result. In this way, the present invention proposes not only the image enhancement and improved binarization that have been studied intensively for low-resolution camera document image recognition, but also a way to eliminate the regions of a camera document image that are commonly unrecognizable, together with character feature extraction and recognition methods for camera characters of various forms, with the aim of both improving camera character recognition performance and keeping that performance stable.

To achieve the above object, a camera character recognition apparatus using the pixel-shift document image combination recognition method of the present invention comprises: a sub-document image processing unit that processes an input image into a plurality of pixel-shifted sub-document images and outputs them; a binarization unit that locally binarizes each sub-document image after it has passed through an image enhancement algorithm; a character processing unit that extracts, from the binarized image, structural feature information of individual characters and pixel-shift mesh features of the pixel-shifted character regions; and a recognition processing unit that outputs a recognition result by combining character-type classification with heterogeneous recognizers that recognize the individual characters.

Meanwhile, a camera character recognition method using the pixel-shift document image combination recognition method of the present invention comprises the steps of: (a) processing an input image into a plurality of pixel-shifted sub-document images and outputting them; (b) applying an image enhancement algorithm to the sub-document images; (c) applying binarization to the enhanced images to extract binary document images; (d) extracting, from the binarized images, structural feature information of individual characters and pixel-shift mesh features of the pixel-shifted character regions; and (e) outputting a recognition result by combining character-type classification with heterogeneous recognizers that recognize the individual characters.

As described above, the present invention proposes a method in which pixel-shifted sub-document images are acquired with a camera, an image enhancement algorithm is applied, local binarization is performed on each sub-document image, and characters are then recognized by combining character extraction, character feature extraction, and heterogeneous recognizers. To reduce the effects of ambient lighting and camera distortion, the invention applies sub-document image input, an image enhancement algorithm, and local binarization of each sub-document image. Individual characters are extracted with a merge-and-split algorithm, pixel-shift mesh features are extracted from them, the characters are classified by type, and they are recognized by combined heterogeneous recognizers.

Thus, the camera character recognition apparatus and method using the pixel-shift document image combination recognition method according to the present invention recognizes characters by inputting pixel-shifted sub-document images, extracting individual character features such as pixel-shift mesh features, and combining weighted heterogeneous recognizers. This compensates for the problem that camera character images have been regarded as difficult recognition targets because of ambient lighting and camera lens effects, and it improves recognition performance.

Camera character recognition is now used in robot vision, for example in book-reading robots, and as a means of acquiring information by building camera and recognition functions into mobile devices such as mobile phones. In this context, ensuring more stable camera character recognition in the real world and combining the recognition results with other application services can be expected to offer convenience to users.

Hereinafter, the camera character recognition apparatus and method using the pixel-shift document image combination recognition method of the present invention will be described in detail with reference to the accompanying drawings.

FIG. 1 is a control circuit block diagram of a character recognition apparatus using the pixel-shift document image combination recognition method according to an embodiment of the present invention.

Referring to FIG. 1, the character recognition apparatus using the pixel-shift document image combination recognition method of the present invention broadly comprises a sub-document image processing unit (1), a binarization unit (2), a character extraction unit (3), a character feature extraction unit (4), a recognizer (5), and a combiner (6). The character extraction unit (3) and the character feature extraction unit (4) may together be defined as the character processing unit, and the recognizer (5) and the combiner (6) as the recognition processing unit. The apparatus may be built into the camera itself, or it may be installed on a server so that the camera receives the service over a network.

The sub-document image processing unit (1) outputs several pixel-shifted sub-document images when a document image is input, as a way of eliminating unrecognizable regions in the camera document image.

Regarding the binarization unit (2): distortion and focus problems arising from the characteristics of the camera lens often make character segmentation and recognition difficult in particular parts of the camera document image, especially for characters located at the edges of the document. With a low-resolution camera, character images may also be blurred or overlap neighboring characters. The binarization unit therefore applies an image enhancement algorithm to the pixel-shifted document images and performs local binarization by examining the pixel brightness values of each sub-document image.

The character extraction unit (3) segments the binarized document image into individual characters on the basis of the structural feature information of the individual characters. It includes a function that extracts words from the binarized document image using a projection technique and a function that extracts individual characters from each word region by applying a merge-and-split algorithm.

The character feature extraction unit (4) extracts features from the extracted individual characters. To obtain character features that are insensitive to the deformation of camera characters, it includes a function that extracts pixel-shift mesh features from the character regions obtained by shifting each extracted character up, down, left, and right, and a function that extracts structural features such as contours, chain codes, distances, and the width-to-height ratio of the character image.

The recognizer (5) performs recognition on the extracted characters. It consists of an MLP type classifier (51), which classifies the type of each character to reduce the burden on the recognizers, and heterogeneous recognizers (52) for individual characters, namely an HMM (Hidden Markov Model) and an MLP (Multi-Layer Perceptron), for recognizing characters whose shapes vary widely, as camera characters do.

The combiner (6) derives the final recognition result by weighting the recognition results output from the heterogeneous recognizers, weighting the recognition result output from the MLP type classifier, and combining them.

FIG. 2 is a flowchart of a character recognition method using the pixel-shift document image combination recognition method according to an embodiment of the present invention.

Referring to FIG. 2, an image of the characters to be recognized (indoor or outdoor documents, signboards or business cards, notices or explanatory texts) is captured with a camera, mobile phone camera, web camera, or the like and input to the character recognition apparatus using the pixel-shift document image combination recognition method of the present invention, and sub-document images shifted pixel-wise to the right are acquired (S1). An image enhancement algorithm is applied to the acquired character images so that the character parts are extracted correctly, and the brightness distribution of the pixels is computed to perform local binarization on each sub-document image (S2). Words and individual characters are extracted using the merge-and-split algorithm (S3). Up, down, left, and right pixel-shift mesh features and structural feature vectors are extracted from the extracted individual characters (S4). The extracted individual characters are recognized by the heterogeneous recognizers (S5). The recognition result from each recognizer is weighted, and the results of the two recognizers are then multiplied to obtain the final recognition result (S6). The recognition result is then combined with application software to suit the user's needs. Specific embodiments of each step are described in detail below.

Example

First, the camera document images to be recognized are acquired. To obtain target documents in a variety of forms, camera character image samples are captured at various distances between the camera and the document. Samples are also taken from document images containing a variety of character fonts. The character images used in the simulation were obtained under unconstrained lighting conditions. To test recognition performance on a variety of camera character images, images of printed characters are acquired in indoor and outdoor environments regardless of the medium on which the characters are recorded.

S1: Unlike character images input with a scanner, camera character images suffer from vignetting at the edges of the acquired image and from blurring of the characters, owing to ambient lighting. Because these factors cause characters to be mis-segmented and misrecognized, a method is needed that extracts and recognizes the character parts correctly. Pixel-shifted sub-document images are therefore acquired with the camera to reduce the misrecognized character regions. FIG. 3 shows samples of document images input with a camera.
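As a minimal sketch of step S1, the pixel-shifted sub-document images can be simulated by cropping horizontally offset windows from a single frame. This is an assumption made for illustration: the source acquires the shifted images with the camera itself and does not give shift amounts, and the names `shifted_sub_images`, `shifts`, and `width` are hypothetical.

```python
from typing import List

Image = List[List[int]]  # row-major grayscale image, one int per pixel


def shifted_sub_images(image: Image, shifts: List[int], width: int) -> List[Image]:
    """Crop one fixed-width window per shift, each starting `shift` pixels from the left edge."""
    total_width = len(image[0])
    subs: List[Image] = []
    for s in shifts:
        if s < 0 or s + width > total_width:
            raise ValueError(f"shift {s} with width {width} leaves the image")
        # Same rows, columns s .. s+width-1: a window moved s pixels to the right.
        subs.append([row[s:s + width] for row in image])
    return subs


frame = [[0, 1, 2, 3, 4],
         [5, 6, 7, 8, 9]]
subs = shifted_sub_images(frame, shifts=[0, 2], width=3)
```

A character that sits at the degraded edge of one window lies nearer the centre of another, which is the motivation the text gives for recognizing several shifted images.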

S2: The following [Equation 1] is applied to convert the color image into a brightness image.

[Equation 1] (the equation image was not preserved in this text)

The true-color image is first converted into a 256-color image and then into a brightness-level image using [Equation 1]. The three weighting coefficients in [Equation 1] were set to 0.11, 0.59, and 0.30, respectively.
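The brightness conversion can be sketched as a weighted sum of the color channels. Because the equation itself is not preserved, the mapping of the three stated coefficients to channels is an assumption: here 0.11, 0.59, and 0.30 are taken as the blue, green, and red weights, which matches the commonly used luma weighting. The function name `to_brightness` and its parameters are hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (red, green, blue), each 0..255


def to_brightness(rgb_image: List[List[Pixel]],
                  w_b: float = 0.11,
                  w_g: float = 0.59,
                  w_r: float = 0.30) -> List[List[int]]:
    """Convert an RGB image to brightness levels with a weighted channel sum.

    Assumed form (not confirmed by the source): Y = w_b*B + w_g*G + w_r*R.
    """
    return [[int(round(w_r * r + w_g * g + w_b * b)) for (r, g, b) in row]
            for row in rgb_image]


gray = to_brightness([[(255, 255, 255), (0, 0, 0)],
                      [(0, 255, 0), (0, 0, 255)]])
```

Since the three weights sum to 1, a white pixel stays at the top of the 0..255 range and a black pixel stays at 0.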

In addition, binarization is performed by computing a threshold for local binarization of each sub-document image. An algorithm that binarizes the character image region by region is applied to local regions of the input image after image enhancement, so that character regions can be extracted without being sensitive to ambient lighting. The local binarization results for the sub-document images are shown in FIG. 4. In the target local region of r × r pixels for which the binarization threshold is to be obtained, the difference between the pixel value with the highest brightness level and the pixel value with the lowest brightness level is found, and binarization is performed with the value corresponding to the fraction t of this difference, computed by the following [Equation 2].

[Equation 2] (the equation image was not preserved in this text)

Here, the two pixel-value symbols in [Equation 2] denote the maximum and minimum brightness-level pixel values in the local r × r region, respectively, and the remaining coefficient is a value obtained experimentally.
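A minimal sketch of this local binarization follows, under stated assumptions, since the equation itself is not preserved. The threshold for each region is taken literally from the description as t times the difference between the region's maximum and minimum brightness; the optional `from_min` flag measures that fraction from the local minimum instead, which is a common variant and not something the source states. Tiling the image into non-overlapping r × r blocks and marking dark pixels as text (1) are also assumptions, and all names are hypothetical.

```python
from typing import List


def local_binarize(gray: List[List[int]], r: int, t: float,
                   from_min: bool = False) -> List[List[int]]:
    """Binarize block by block using threshold = t * (max - min) of each r x r block.

    With from_min=True the threshold is min + t * (max - min) instead.
    Pixels darker than the threshold become 1 (text), all others 0.
    """
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]
    for top in range(0, h, r):
        for left in range(0, w, r):
            rows = range(top, min(top + r, h))
            cols = range(left, min(left + r, w))
            values = [gray[y][x] for y in rows for x in cols]
            lo, hi = min(values), max(values)
            threshold = t * (hi - lo) + (lo if from_min else 0)
            for y in rows:
                for x in cols:
                    out[y][x] = 1 if gray[y][x] < threshold else 0
    return out


binary = local_binarize([[0, 200], [200, 200]], r=2, t=0.5)
```

Each block needs only one pass to find its extremes and one to compare, which may be part of why the text describes the method as fast, although the source does not explain the speedup.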

S3: For character segmentation, words are extracted using vertical projection and whitespace information, and individual characters are then extracted using a merge-and-split algorithm. Unlike other scripts, Hangul forms each character by combining vowels and consonants, so a merge-and-split algorithm is applied to the connected pixels to extract individual characters. The process and results of extracting text lines, words, and individual characters are shown in FIGS. 5a to 5c.
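The word-extraction part of step S3 can be sketched as follows, assuming a binary text line in which 1 marks ink: the vertical projection counts ink per column, and runs of ink separated by at least `min_gap` empty columns are treated as separate words. The gap threshold and the function names are hypothetical, and the merge-and-split step for Hangul components is not covered here.

```python
from typing import List, Tuple


def vertical_projection(binary: List[List[int]]) -> List[int]:
    """Count ink pixels (value 1) in every column of a binary text-line image."""
    return [sum(column) for column in zip(*binary)]


def split_on_gaps(profile: List[int], min_gap: int) -> List[Tuple[int, int]]:
    """Return (start, end) column ranges, end exclusive, of ink runs.

    Runs are split only where at least `min_gap` consecutive columns are empty,
    so narrow gaps inside a word do not break it apart.
    """
    segments: List[Tuple[int, int]] = []
    start = None
    gap = 0
    for x, count in enumerate(profile):
        if count > 0:
            if start is None:
                start = x
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:
                segments.append((start, x - gap + 1))
                start, gap = None, 0
    if start is not None:
        segments.append((start, len(profile) - gap))
    return segments


line = [[1, 1, 0, 1, 0, 0, 0, 1, 1],
        [0, 1, 0, 0, 0, 0, 0, 1, 0]]
profile = vertical_projection(line)
words = split_on_gaps(profile, min_gap=2)
```

Lowering `min_gap` to 1 splits at every empty column, which is closer to character-level segmentation for scripts whose characters do not touch.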

S4: Features of each individual character are extracted using the structural feature information of the target character: mesh features, distance information features, and contour information. For the mesh feature information, up, down, left, and right pixel-shift mesh features are extracted. FIG. 6 illustrates the concept of pixel-shift mesh feature extraction.
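One possible reading of the pixel-shift mesh feature is sketched below. It is an assumption, since FIG. 6 is not available: the character image is shifted by a small step up, down, left, and right, a grid ("mesh") of ink counts is computed for each shifted copy, and the four grids are concatenated. The grid size, the shift step, and all function names are hypothetical.

```python
from typing import List

Image = List[List[int]]  # binary character image, 1 = ink


def shift(img: Image, dy: int, dx: int) -> Image:
    """Move the image content by (dy, dx) pixels, filling vacated pixels with 0."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sy, sx = y - dy, x - dx
            if 0 <= sy < h and 0 <= sx < w:
                out[y][x] = img[sy][sx]
    return out


def mesh_feature(img: Image, cells: int) -> List[int]:
    """Ink count in each cell of a cells x cells grid, listed row by row."""
    h, w = len(img), len(img[0])
    feats: List[int] = []
    for cy in range(cells):
        for cx in range(cells):
            y0, y1 = cy * h // cells, (cy + 1) * h // cells
            x0, x1 = cx * w // cells, (cx + 1) * w // cells
            feats.append(sum(img[y][x] for y in range(y0, y1) for x in range(x0, x1)))
    return feats


def shifted_mesh_features(img: Image, cells: int, step: int = 1) -> List[int]:
    """Concatenate mesh features of the image shifted up, down, left, and right."""
    offsets = [(-step, 0), (step, 0), (0, -step), (0, step)]
    feats: List[int] = []
    for dy, dx in offsets:
        feats.extend(mesh_feature(shift(img, dy, dx), cells))
    return feats


glyph = [[1, 0],
         [0, 0]]
features = shifted_mesh_features(glyph, cells=2)
```

Including the shifted copies makes the feature vector less sensitive to a character being segmented a pixel off, which is the kind of deformation tolerance the text asks of the features.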

S5: The type of each extracted character is classified, and the character is then recognized. To recognize text in which Hangul, English letters, symbols, and digits are mixed, Hangul characters are classified into six character types, and English letters, symbols, and digits are classified as a non-Hangul type, so that characters are recognized separately for each character type. In this embodiment, a type classifier using an MLP and individual character recognizers using HMMs and an MLP are implemented.
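The two-stage structure of step S5 (classify the character type, then recognize with a recognizer for that type) can be sketched as a simple dispatch. The type labels, the stand-in classifier, and the stand-in recognizers below are hypothetical placeholders; in the embodiment the classifier is an MLP and the recognizers are HMM-based and MLP-based, which this sketch does not implement.

```python
def recognize_by_type(features, type_classifier, recognizers):
    """Predict the character type, then apply the recognizer registered for that type."""
    char_type = type_classifier(features)
    return char_type, recognizers[char_type](features)


# Hypothetical stand-ins for a trained type classifier and per-type recognizers.
def toy_type_classifier(features):
    return "non_hangul" if features[0] > 0.5 else "hangul_type_1"


toy_recognizers = {
    "hangul_type_1": lambda features: "ga",
    "non_hangul": lambda features: "A",
}

print(recognize_by_type([0.9, 0.1], toy_type_classifier, toy_recognizers))
print(recognize_by_type([0.2, 0.7], toy_type_classifier, toy_recognizers))
```

A full table would hold seven entries, one for each of the six Hangul types and one for the non-Hangul type.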

S6: To recognize characters that are difficult to recognize, as in camera character recognition, heterogeneous recognizers are combined: their results are weighted and then multiplied to recognize the character. FIG. 7 illustrates the weighted heterogeneous recognizer combination method. As shown in FIG. 7, recognition result 1, which combines the results of recognizing an individual character with a plurality of HMMs, and recognition result 2, which combines the results of recognizing it with the MLP, are each assigned a weight and then combined.
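The weighted multiplicative combination of step S6 can be sketched as below. The source does not state how the weights enter the product, so applying them as exponents (a weighted geometric combination) is an assumption; multiplying each score by a positive constant weight before taking the product would rescale every candidate equally and leave the ranking unchanged. The function name, weight names, and candidate scores are illustrative.

```python
def combine(scores_hmm, scores_mlp, w_hmm=0.5, w_mlp=0.5):
    """Combine per-candidate scores from two recognizers by a weighted product.

    Assumption (not from the source): each weight is used as an exponent,
    i.e. combined = scores_hmm[c] ** w_hmm * scores_mlp[c] ** w_mlp.
    """
    labels = scores_hmm.keys() & scores_mlp.keys()
    combined = {c: (scores_hmm[c] ** w_hmm) * (scores_mlp[c] ** w_mlp)
                for c in labels}
    best = max(combined, key=combined.get)
    return best, combined


# Toy scores for two easily confused candidates: letter "O" and digit "0".
hmm_scores = {"O": 0.6, "0": 0.4}   # stands in for recognition result 1
mlp_scores = {"O": 0.3, "0": 0.7}   # stands in for recognition result 2

print(combine(hmm_scores, mlp_scores)[0])
print(combine(hmm_scores, mlp_scores, w_hmm=0.9, w_mlp=0.1)[0])
```

With equal weights the MLP's stronger preference decides the outcome, while a weight that favors the HMM result reverses it, which shows how the weights let the more reliable recognizer dominate.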

Thereafter, the character recognition results are used for document editing or database construction according to the user's needs, or in combination with application software such as foreign-language conversion and voice services.

As described above, in the present invention, pixel-shifted sub-document images are acquired with a camera, the image quality is improved with an image enhancement algorithm, and local binarization is performed so that no document region is left unrecognized. In extracting and recognizing individual characters, several kinds of structural character feature information are extracted and combined; in particular, pixel-shifted mesh features are extracted and applied to recognize characters of widely varying shape, such as camera-captured characters. In addition, heterogeneous recognizers are designed to obtain a recognition result for each sub-document image; to exploit the complementary strengths of the heterogeneous recognizers, the recognition results are weighted, and the two recognition results are multiplied to obtain the final recognition result.

Unlike conventional scanner-based character recognition, camera character recognition is made very difficult by ambient lighting and camera lens effects. The present invention therefore takes several pixel-shifted sub-document images as input and applies image enhancement and local binarization to them. When the features of individual characters are extracted, features that account for the shape deformation of camera-captured characters are used, and heterogeneous recognizers are combined for recognition, thereby improving the performance of camera character recognition.

Although the present invention has been described in detail above with reference to several embodiments, it is not necessarily limited to those embodiments and may be modified in various ways without departing from its technical spirit.

FIG. 1 is a control circuit block diagram of a character recognition apparatus using a pixel-shift document image combination recognition method according to an embodiment of the present invention;

FIG. 2 is a flowchart of a character recognition method using a pixel-shift document image combination recognition method according to an embodiment of the present invention;

FIG. 3 is a diagram showing sample document images input by a camera;

FIG. 4 is a diagram showing local binarization results for sub-document images;

FIGS. 5A to 5C are diagrams showing the process and results of extracting text lines, words, and individual characters;

FIG. 6 is a diagram showing pixel-shift mesh feature extraction; and

FIG. 7 is a diagram showing the weighted heterogeneous recognizer combination recognition method.

<Explanation of symbols for main parts of the drawings>

1: sub-document image processing unit    2: binarization unit

3: character extraction unit    4: character feature extraction unit

5: recognizer    51: MLP type classifier

52: HMM (Hidden Markov Model) and MLP (Multi-Layer Perceptron)

6: combiner

Claims

A camera character recognition apparatus using a pixel-shift document image combination recognition method, comprising:

a sub-document image processing unit that processes and outputs a plurality of sub-document images in which pixels are shifted with respect to an input image;

a binarization unit that locally binarizes the sub-document images after an image enhancement algorithm has been applied to them;

a character processing unit that extracts, from the binarized image, structural feature information of individual characters and pixel-shift mesh features of character regions in which pixels are shifted; and

a recognition processing unit that outputs a recognition result by combining character type classification with heterogeneous recognizers that recognize the individual characters.

The apparatus of claim 1, wherein the character processing unit comprises:

a character extraction unit that segments and extracts individual characters from the binarized image using the structural feature information of the individual characters; and

a character feature extraction unit that extracts, for the extracted individual characters, pixel-shift mesh features of the character regions in which pixels are shifted in the up, down, left, and right directions, together with structural features including the contour, chain code, distance, and aspect ratio of the character image.

The apparatus of claim 2, wherein the character extraction unit

extracts words from the binarized image using a projection technique and extracts individual characters by applying a combination and separation algorithm to the words.

The apparatus of any one of claims 1 to 3, wherein the recognition processing unit comprises:

an MLP type classifier that classifies character types including Hangul, English letters, symbols, and digits;

a heterogeneous recognizer that performs character recognition on the individual characters against previously learned characters; and

a combiner that derives a final recognition result by assigning weights to the recognition results output from the heterogeneous recognizer and combining them, with weighting, with the recognition result output from the MLP type classifier.

The apparatus of claim 4, wherein the heterogeneous recognizer comprises a plurality of Hidden Markov Models (HMMs) and a Multi-Layer Perceptron (MLP).

A camera character recognition method using a pixel-shift document image combination recognition method, comprising:

(a) processing and outputting a plurality of sub-document images in which pixels are shifted with respect to an input image;

(b) applying an image enhancement algorithm to the sub-document images;

(c) extracting a binary document image by applying binarization to the images that have undergone the image enhancement algorithm;

(d) extracting, from the binarized image, structural feature information of individual characters and pixel-shift mesh features of character regions in which pixels are shifted; and

(e) outputting a recognition result by combining character type classification with heterogeneous recognizers that recognize the individual characters.

The method of claim 6, wherein the image enhancement algorithm of step (b) is a brightness-level normalization method that emphasizes the characters against the background so that only the character portions are extracted.

The method of claim 6, wherein the binarization of step (c) is performed locally, with the binarization threshold determined by examining the brightness distribution of the pixels of the sub-document image.

The method of claim 6, wherein step (c) further comprises a noise removal step of analyzing the sizes of the connected pixels in the binarization result and removing those, including tables and pictures, that are not characters.

The method of any one of claims 6 to 9, wherein extracting the pixel-shift mesh features of the character regions in which pixels are shifted in step (d) comprises:

(d-1) segmenting and extracting individual characters from the binarized image using the structural feature information of the individual characters; and

(d-2) extracting, for the extracted individual characters, pixel-shift mesh features of the character regions in which pixels are shifted in the up, down, left, and right directions, together with structural features including the contour, chain code, distance, and aspect ratio of the character image.

The method of claim 10, wherein the segmentation into individual characters in step (d-1) comprises extracting words from the binarized image using a projection technique and extracting individual characters by applying a combination and separation algorithm to the words.

The method of claim 11, wherein step (e) of combining the character type classification with the heterogeneous recognizers that recognize the individual characters and outputting a recognition result comprises:

(e-1) classifying character types including Hangul, English letters, symbols, and digits with a Multi-Layer Perceptron (MLP) type classifier;

(e-2) performing character recognition on the individual characters against previously learned characters using a heterogeneous recognizer; and

(e-3) deriving a final recognition result by assigning weights to the recognition results output from the heterogeneous recognizer and combining them, with weighting, with the recognition result output from the MLP type classifier.