KR101912758B1

KR101912758B1 - Method and apparatus for rectifying document image

Info

Publication number: KR101912758B1
Application number: KR1020170045970A
Authority: KR
Inventors: 조남익; 서원교; 길태호
Original assignee: 주식회사 한글과컴퓨터; 서울대학교산학협력단
Priority date: 2017-04-10
Filing date: 2017-04-10
Publication date: 2018-10-30
Also published as: KR20180114352A

Abstract

A method and apparatus for rectifying a document image are disclosed. The method includes: designing a first cost function based on properties of line segments included in a non-text region of the document image; designing a second cost function based on properties of text lines included in a text region of the document image; combining the first cost function and the second cost function to generate a third cost function; and removing distortion present in the document image using the third cost function.

Description

METHOD AND APPARATUS FOR RECTIFYING DOCUMENT IMAGE

The embodiments disclosed herein relate to a method and apparatus for rectifying document images.

In recent years, as the cameras built into personal portable devices such as smartphones have improved, users often photograph a document with such a camera instead of scanning it with a scanner, and use the photographed document image in place of a scanned image.

However, a document image captured with a camera is likely to be distorted by the positional relationship between the camera and the document and by the geometric shape of the document surface. Converting a document image into an electronic document requires accurate character recognition, so a rectification process that removes the distortion in the document image is needed before character recognition is performed.

In this regard, Korean Patent Laid-Open Publication No. 10-2015-0037374, a prior-art document, discloses a method and apparatus that convert an irregular document image into a scanned document without a scanner by extracting text lines from a single document image taken with an ordinary camera and computing the parameters that optimize a projection equation for projecting the extracted text lines onto the scanned-document plane.

The background art described above is technical information that the inventors possessed for, or acquired in the course of, deriving the present invention, and is not necessarily known art disclosed to the general public before the filing of the present application.

The embodiments disclosed herein aim to provide a method and apparatus for rectifying a document image and, in particular, for effectively removing distortion present in non-text regions.

As a technical means for achieving the above technical object, according to one embodiment, a document image rectification method may include: designing a first cost function based on properties of line segments included in a non-text region of a document image; designing a second cost function based on properties of text lines included in a text region of the document image; combining the first cost function and the second cost function to generate a third cost function; and removing distortion present in the document image using the third cost function.

According to another embodiment, there is provided a computer program for performing a document image rectification method, where the method may include the same steps: designing the first cost function from the line segments of the non-text region, designing the second cost function from the text lines of the text region, combining the two into a third cost function, and removing distortion present in the document image using the third cost function.

According to yet another embodiment, there is provided a computer-readable recording medium on which a program for performing the document image rectification method is recorded, where the method may include the same steps described above.

According to yet another embodiment, a document image rectification apparatus includes: an input/output unit that receives input related to the processing of a document image and shows the progress and result of that processing; a storage unit that stores a program for performing document image rectification; and a control unit that rectifies the document image by executing the program. The control unit may design a first cost function based on properties of line segments included in a non-text region of the document image, design a second cost function based on properties of text lines included in a text region of the document image, and remove distortion present in the document image using a third cost function generated by combining the first cost function and the second cost function.

According to any one of the above means, a cost function that reflects the properties of line segments included in the non-text region of a document image is designed and used in the rectification process, so that distortion present in the non-text region can also be removed effectively.

The effects obtainable from the disclosed embodiments are not limited to those mentioned above, and other effects not mentioned will be clearly understood from the following description by a person of ordinary skill in the art to which the disclosed embodiments belong.

FIG. 1 is a block diagram illustrating the configuration of a document image rectification apparatus according to an embodiment.
FIG. 2 is a diagram showing the relationship between a hard copy document and a document image obtained by photographing it with a camera.
FIGS. 3 and 4 are diagrams showing distorted document images according to an embodiment.
FIG. 5 is a diagram showing the result of grouping line segments included in a document image according to an embodiment.
FIG. 6 is a diagram showing a distorted document image according to an embodiment.
FIG. 7 is a flowchart illustrating a document image rectification method according to an embodiment.
FIG. 8 is a diagram comparing the result of the document image rectification method according to an embodiment with the result of a conventional rectification method.

Various embodiments are described in detail below with reference to the accompanying drawings. The embodiments described below may be modified and implemented in various different forms. To describe the features of the embodiments more clearly, detailed descriptions of matters widely known to a person of ordinary skill in the art to which the embodiments belong are omitted. In the drawings, parts unrelated to the description of the embodiments are omitted, and similar parts are given similar reference numerals throughout the specification.

Throughout the specification, when one component is said to be "connected" to another, this includes not only the case where they are directly connected but also the case where they are connected with another component in between. In addition, when a component is said to "include" another component, this means that, unless specifically stated otherwise, it may further include other components rather than excluding them.

Hereinafter, embodiments are described in detail with reference to the accompanying drawings.

FIG. 1 is a block diagram illustrating the configuration of a document image rectification apparatus according to an embodiment.

The document image rectification apparatus according to an embodiment may be a device that photographs a document directly and rectifies the photographed document image. For example, it may be a terminal such as a smartphone or tablet that has a camera and on which an application for performing document image rectification is installed.

Alternatively, the document image rectification apparatus may be a device that receives a document image from another device and rectifies the received image. For example, it may be a desktop or notebook computer on which a program for performing document image rectification is installed.

Alternatively, the document image rectification apparatus may be a server that receives a document image from a user's terminal over a network, performs rectification, and transmits the result back to the user's terminal.

Referring to FIG. 1, a document image rectification apparatus 100 according to an embodiment may include an input/output unit 110, a control unit 120, a storage unit 130, and a communication unit 140. Although not shown in FIG. 1, the document image rectification apparatus 100 may further include a photographing unit.

The input/output unit 110 may receive input related to the processing of a document image from the user and display a screen showing the progress and result of that processing. For example, the input/output unit 110 may include an operation panel that receives user input and a display panel that displays a screen.

Specifically, the input unit may include devices capable of receiving various forms of user input, such as a keyboard, physical buttons, a touch screen, a camera, or a microphone. The output unit may include a display panel or a speaker. The input/output unit 110 is not limited to these and may include components supporting various other kinds of input and output.

The control unit 120 controls the overall operation of the document image rectification apparatus 100 and may include a processor such as a CPU. According to one embodiment, the control unit 120 may include at least one processor. The control unit 120 may carry out the processes for rectifying a document image by executing a program stored in the storage unit 130.

Various kinds of data, such as files, applications, and programs, may be installed and stored in the storage unit 130. According to one embodiment, a program for performing the document image rectification method may be installed in the storage unit 130.

The communication unit 140 may perform wired or wireless communication with another device or a network. To this end, the communication unit 140 may include a communication module that supports at least one of various wired and wireless communication methods. For example, the communication module may be implemented as a chipset. According to one embodiment, the communication unit 140 may receive a document image from another device.

The wireless communication supported by the communication unit 140 may be, for example, Wi-Fi (Wireless Fidelity), Wi-Fi Direct, Bluetooth, UWB (Ultra Wide Band), or NFC (Near Field Communication). The wired communication supported by the communication unit 140 may be, for example, USB or HDMI (High Definition Multimedia Interface). The communication unit 140 may also transmit data or messages to a destination over the Internet or a mobile communication network.

The process by which the document image rectification apparatus 100 rectifies a document image according to an embodiment is described in detail below.

Before describing the specific operation of the document image rectification apparatus 100, the distortion that arises in a document image obtained by photographing a hard copy document with a camera, and the rectification process that removes such distortion, are described first.

FIG. 2 is a diagram showing the relationship between a hard copy document and a document image obtained by photographing it with a camera.

Referring to FIG. 2, photographing the hard copy document 20 with the camera 10 yields the document image 21. If (α, β) denotes the coordinates on the document image 21 that correspond to the coordinates (X, Y, Z) on the hard copy document 20, the two sets of coordinates are related as in Equation 1 below.

Here, f is the focal length of the camera 10; c_x and c_y are the x-axis and y-axis coordinates of the principal point, which corresponds to the center of the document image 21; R is the pose of the camera 10; and T is the position difference vector between the camera 10 and the hard copy document 20.
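The symbols defined above match the standard pinhole camera model, in which a document point is first moved into the camera frame by R and T and then mapped to pixels using f, c_x, and c_y. The sketch below illustrates that model under stated assumptions: the function name `project`, the row-major list layout of `R`, and the sign conventions are illustrative choices, and the exact form of Equation 1 in the patent is not reproduced in this text.

```python
def project(point, f, cx, cy, R, T):
    """Project a 3-D document point (X, Y, Z) to image coordinates (alpha, beta).

    Assumes the standard pinhole model; R is a 3x3 rotation given as a list of
    rows and T is a 3-vector. These conventions are assumptions of this sketch.
    """
    # Camera-frame coordinates: p_cam = R * point + T
    cam = [sum(R[r][c] * point[c] for c in range(3)) + T[r] for r in range(3)]
    if cam[2] <= 0:
        raise ValueError("point is not in front of the camera")
    # Perspective division, then shift by the principal point.
    alpha = f * cam[0] / cam[2] + cx
    beta = f * cam[1] / cam[2] + cy
    return alpha, beta


IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
```

For instance, with the camera looking straight at a flat page 500 units away, `project((50, -25, 0), 1000, 320, 240, IDENTITY, (0, 0, 500))` maps the point to a pixel offset from the principal point in proportion to f divided by the depth.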

Meanwhile, in the case of a book, which is a typical form of the hard copy document 20, the surface is generally curved. To model this curvature mathematically, the curved surface of the hard copy document 20 is assumed to have the form of a generalized cylindrical surface. The coordinates (X, Y, Z) of a given point on the hard copy document 20 in FIG. 2 then satisfy Equation 2 below.
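A generalized cylindrical surface is one whose height varies along only one page axis, so the constraint on (X, Y, Z) can be written as a single-variable function. One common way to write such a constraint is shown below; the polynomial parameterization, the degree M, and the coefficient names a_m are assumptions made for illustration and are not taken from Equation 2 of the patent, whose exact form is not reproduced in this text.

```latex
% Height depends only on X (the axis across the page), not on Y.
Z = g(X), \qquad g(X) = \sum_{m=0}^{M} a_m X^{m}
```

Under this kind of parameterization the unknowns that describe the page curl are the coefficients of g, which would be estimated together with the camera pose.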

In the document image 21 obtained by photographing the hard copy document 20 with the camera 10, distortion arises from the position of the camera 10 at the time of photographing and from the geometric shape of the hard copy document 20, such as the curvature of its surface. The process of removing this distortion so that the image looks as if a flat document had been photographed from the front is called rectification.

The document image 21 can be rectified by using Equations 1 and 2 and estimating the values of the variables related to the camera pose and the curvature of the document surface so that the distortion in the document image 21 is removed. To estimate these values, a cost function whose value decreases as the distortion is removed is designed, and the variable values that minimize the cost function are estimated. The embodiments described below concern how to design the cost function used to remove distortion from the document image 21.
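The idea of estimating parameters by minimizing a cost that shrinks as distortion is removed can be illustrated with a deliberately small toy problem. In the sketch below, a single text line is bent by an unknown curvature, and a grid search finds the curvature that makes the corrected line flattest. The one-parameter distortion model, the function names, and the grid search are all assumptions of this sketch; the patent's method estimates camera pose and surface-shape variables and does not specify an optimizer in this text.

```python
def straightness_cost(points, a):
    """Sum of squared deviations from the mean y after undoing a bend of a * x^2."""
    ys = [y - a * x * x for x, y in points]
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)


def estimate_curvature(points, lo=-0.01, hi=0.01, steps=2001):
    """Grid search for the curvature value that minimizes the straightness cost."""
    best_a, best_cost = lo, float("inf")
    for i in range(steps):
        a = lo + (hi - lo) * i / (steps - 1)
        cost = straightness_cost(points, a)
        if cost < best_cost:
            best_a, best_cost = a, cost
    return best_a


# Synthetic text line bent by a toy curvature of 0.002 (hypothetical data).
observed = [(x, 100.0 + 0.002 * x * x) for x in range(-50, 51, 10)]
```

Calling `estimate_curvature(observed)` recovers a value close to the curvature used to generate the data. A practical implementation would more likely use a nonlinear least-squares solver over many parameters rather than a one-dimensional grid.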

The document image rectification apparatus 100 according to an embodiment generates a new cost function by combining a cost function that reflects the properties of text lines in the text region with a cost function that reflects the properties of line segments in the non-text region. By using the resulting cost function, it can effectively remove distortion not only in the text region but also in the non-text region.

First, the process by which the control unit 120 designs a cost function reflecting the properties of text lines is described.

The more distortion is removed from a document image, the more likely the text lines in the image are to satisfy the following properties. First, each text line is a horizontal straight line. Second, the spacing between text lines in the same paragraph is constant. Third, the text lines in the same paragraph are aligned to the left or to the right.

Accordingly, the control unit 120 may design the cost functions of Equations 3 to 5 below, each reflecting one of these properties.

The quantities used in Equations 3 to 5 are the y-axis coordinate of the i-th character in the k-th text line, the y-axis coordinate of the k-th text line, the leftmost x-axis coordinate of the k-th text line (for left alignment), and the x-axis coordinate of the paragraph alignment line.
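The three text-line properties can each be turned into a quantity that is zero when the property holds exactly. The sketch below shows one plausible way to do that using sums of squared deviations. The function names, the use of squared error, and the choice of the mean as the line position and alignment position are assumptions for illustration; the exact forms of Equations 3 to 5 are not reproduced in this text.

```python
def straight_line_cost(lines):
    """lines: list of text lines, each a non-empty list of character y-coordinates."""
    total = 0.0
    for ys in lines:
        line_y = sum(ys) / len(ys)  # assumed: line position = mean character y
        total += sum((y - line_y) ** 2 for y in ys)
    return total


def spacing_cost(line_ys):
    """line_ys: y-coordinate of each text line in one paragraph, top to bottom."""
    gaps = [b - a for a, b in zip(line_ys, line_ys[1:])]
    if not gaps:
        return 0.0
    mean_gap = sum(gaps) / len(gaps)
    return sum((g - mean_gap) ** 2 for g in gaps)


def alignment_cost(left_xs, align_x=None):
    """left_xs: leftmost x-coordinate of each text line in one paragraph."""
    if not left_xs:
        return 0.0
    if align_x is None:
        align_x = sum(left_xs) / len(left_xs)  # assumed: alignment line = mean
    return sum((x - align_x) ** 2 for x in left_xs)
```

Each function returns zero for perfectly straight, evenly spaced, or perfectly aligned input, and grows as the corresponding property is violated.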

The cost functions of Equations 3 to 5 take smaller values, respectively, the closer the text lines are to horizontal straight lines, the more uniform the spacing between text lines, and the better the text lines are aligned to the left. The control unit 120 can design a cost function for document rectification by combining these cost functions with appropriate coefficients. For example, the control unit 120 may design cost functions as in Equations 6 and 7 below.

The two coefficients in Equations 6 and 7 are weights between the cost functions that reflect the individual text-line properties, and may be set to appropriate values depending on the design environment.

For example, it may be effective to rectify a document image that largely satisfies left (or right) alignment using Equation 6, and a document image whose text-line spacing is largely constant using Equation 7.

In addition, the control unit 120 may, as needed, design a cost function by combining Equations 4 and 5, or by combining all of Equations 3 to 5.
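Read together, the statements above suggest that Equation 6 pairs the straight-line cost with the alignment cost and Equation 7 pairs it with the spacing cost, each with one weight. The block below writes that reading down as a hedged sketch: the names f_3, f_4, f_5 (standing for the cost functions of Equations 3, 4, and 5), the parameter vector Θ, and the weights λ_1 and λ_2 are notation introduced here, and the pairing itself is an inference rather than a quotation of the patent's equations.

```latex
% Assumed reading: Eq. 6 = straightness + weighted alignment,
%                  Eq. 7 = straightness + weighted spacing.
f_{6}(\Theta) = f_{3}(\Theta) + \lambda_{1}\, f_{5}(\Theta), \qquad
f_{7}(\Theta) = f_{3}(\Theta) + \lambda_{2}\, f_{4}(\Theta)
```

The other combinations mentioned in the text would follow the same pattern, for example a weighted sum of all three terms.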

Next, the process by which the control unit 120 designs a cost function reflecting the properties of line segments in the non-text region is described in detail.

In general, the more distortion is removed from a document image, the more likely the line segments in its non-text region are to satisfy the following properties. First, each line segment is straight. Second, line segments that form the border of a table or figure are arranged horizontally or vertically.

The design of a cost function reflecting the first property (line segments are straight) is described first.

FIGS. 3 and 4 are diagrams showing distorted document images according to an embodiment. Referring to FIG. 3, the line segments in regions 31 and 32 of the document image 30 are bent into curves. Referring to FIG. 4, the line segment corresponding to the border of the document in the document image 40, that is, the line segment in region 41, is likewise bent into a curve. All of these distortions are caused by the curvature of the document surface.

To remove such distortion, a preprocessing step is first needed that groups the small line segments making up one long line segment. The control unit 120 therefore groups line segments into one group when they are closer to one another than a certain threshold and have similar directions. The process of removing distortion from the grouped line segments is described below with reference to FIG. 5.
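The grouping rule (segments that are close together and point in similar directions belong to the same long line) can be sketched with a small union-find pass over all segment pairs. The thresholds `max_gap` and `max_angle`, the use of the closest endpoint pair as the distance measure, and the function names are assumptions of this sketch; the patent text only states that distance and direction are compared against a criterion.

```python
import math


def direction(seg):
    """Undirected angle of a segment in [0, pi)."""
    (x1, y1), (x2, y2) = seg
    return math.atan2(y2 - y1, x2 - x1) % math.pi


def angle_diff(a, b):
    """Smallest difference between two undirected angles."""
    d = abs(a - b)
    return min(d, math.pi - d)


def endpoint_gap(s, t):
    """Distance between the closest pair of endpoints of two segments."""
    return min(math.dist(p, q) for p in s for q in t)


def group_segments(segments, max_gap=5.0, max_angle=math.radians(10)):
    """Return groups of segment indices that are chained by proximity and direction."""
    parent = list(range(len(segments)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(segments)):
        for j in range(i + 1, len(segments)):
            close = endpoint_gap(segments[i], segments[j]) <= max_gap
            similar = angle_diff(direction(segments[i]), direction(segments[j])) <= max_angle
            if close and similar:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(segments)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

Because membership is transitive, a gently curving chain of short segments ends up in one group even when its first and last pieces differ in direction by more than `max_angle`, which is the situation a bent table border produces.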

FIG. 5 is a diagram showing the result of grouping line segments included in a document image according to an embodiment. Referring to FIG. 5, N line segments (51, 52, 53, …, 5N) have been grouped into one group.

In FIG. 5, given the (x, y) coordinates of the start point P1 of the first line segment 51 and the (x, y) coordinates of the end point P2 of the last line segment 5N, the straight line connecting the two points P1 and P2 can be expressed as in Equations 8 and 9 below.

The closer the start points of the N line segments (51, 52, 53, …, 5N) are to the straight line connecting P1 and P2, the closer the long line segment 50 is to a straight line. Using this property, the control unit 120 may design a cost function as in Equation 10 below.

According to Equation 10, the value of the cost function decreases as the start points of the N line segments (51, 52, 53, …, 5N) approach the straight line connecting P1 and P2.
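The construction described for Equations 8 to 10 (a line through P1 and P2, and a cost that shrinks as segment start points approach it) can be sketched as follows. The normalized implicit line form a·x + b·y + c = 0 and the use of squared perpendicular distance are assumptions of this sketch, as are the function names; the exact expressions in the patent are not reproduced in this text.

```python
import math


def line_through(p1, p2):
    """Return (a, b, c) with a*x + b*y + c = 0 through p1 and p2, where a^2 + b^2 = 1."""
    (x1, y1), (x2, y2) = p1, p2
    a, b = y2 - y1, x1 - x2
    norm = math.hypot(a, b)
    if norm == 0:
        raise ValueError("P1 and P2 must be distinct points")
    a, b = a / norm, b / norm
    return a, b, -(a * x1 + b * y1)


def group_straightness_cost(segments):
    """segments: ordered list of ((x_start, y_start), (x_end, y_end)) in one group.

    Sums the squared perpendicular distance of every start point to the line
    joining the first start point (P1) and the last end point (P2).
    """
    p1 = segments[0][0]
    p2 = segments[-1][1]
    a, b, c = line_through(p1, p2)
    return sum((a * x + b * y + c) ** 2 for (x, y), _ in segments)
```

With the line normalized so that a² + b² = 1, the expression a·x + b·y + c is the signed distance from (x, y) to the line, so the cost is zero exactly when all start points are collinear with P1 and P2.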

Next, the design of a cost function reflecting the second property (line segments forming the border of a table or figure are arranged horizontally or vertically) is described.

FIG. 6 is a diagram showing a distorted document image according to an embodiment. Referring to FIG. 6, the line segments forming the border of the table in the document image 60 deviate from the horizontal and vertical directions. The control unit 120 can correct this distortion by designing a cost function whose value decreases as these line segments approach the horizontal or vertical direction.

In general, a line segment forming the border of a table or figure is longer than the average length of a text line, so the control unit 120 can use this property to extract the border line segments of tables and figures from the document image.
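The length criterion for picking out table and figure borders can be sketched as a simple filter. Using the arithmetic mean of the text-line lengths as the threshold follows the text; the function names and the input representation are assumptions of this sketch.

```python
import math


def segment_length(seg):
    """Euclidean length of a segment given as ((x1, y1), (x2, y2))."""
    (x1, y1), (x2, y2) = seg
    return math.hypot(x2 - x1, y2 - y1)


def extract_border_segments(segments, text_line_lengths):
    """Keep only segments longer than the mean text-line length."""
    if not text_line_lengths:
        return []  # no reference length available, so nothing is selected
    threshold = sum(text_line_lengths) / len(text_line_lengths)
    return [s for s in segments if segment_length(s) > threshold]
```

Short strokes such as underlines or parts of glyphs fall below the threshold and are left out of the horizontal/vertical constraint.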

Given the (x, y) coordinates of the start point and of the end point of the i-th extracted line segment, the cost function of Equation 11 below takes a smaller value the closer the extracted line segments are to the horizontal or vertical direction.
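A cost that vanishes for both horizontal and vertical segments can be built from the product of a segment's horizontal and vertical extents, since that product is zero whenever either extent is zero. The sketch below uses the squared, length-normalized product, which equals (sin θ · cos θ)² for a segment at angle θ. This particular formula is an assumption chosen for illustration; the expression in Equation 11 of the patent is not reproduced in this text.

```python
def axis_alignment_cost(segments):
    """Sum over segments of (sin(theta) * cos(theta))^2, theta being the segment angle.

    The value is zero for exactly horizontal or vertical segments and largest
    for segments at 45 degrees.
    """
    total = 0.0
    for (x1, y1), (x2, y2) in segments:
        dx, dy = x2 - x1, y2 - y1
        length_sq = dx * dx + dy * dy
        if length_sq == 0:
            continue  # a degenerate segment has no direction
        total += (dx * dy / length_sq) ** 2
    return total
```

Normalizing by the squared length makes the cost depend only on direction, so long and short borders contribute equally; weighting by length would be an equally reasonable design choice.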

Accordingly, the control unit 120 may combine the cost functions of Equations 6, 7, 10, and 11 described above to design the cost function of Equation 12 below.

Here, the text-line term of Equation 12 may be either Equation 6 or Equation 7, or a cost function generated by appropriately combining the cost functions of Equations 3 to 5 in some other form.

Although the terms on the right-hand side of Equation 12 do not include coefficients, a coefficient with an arbitrary value may be included as needed to weight the cost function of each term.

Also, although the right-hand side of Equation 12 includes the cost functions of both Equation 10 and Equation 11, the cost function may be designed to include only one of them.
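The description of Equation 12 (a text-line term plus the two line-segment terms, optionally weighted) can be summarized in the following hedged form. The notation is introduced here for illustration: f_text stands for the text-line term, f_10 and f_11 for the cost functions of Equations 10 and 11, Θ for the camera-pose and surface-shape variables, and μ_1, μ_2 for the optional weights. Setting μ_1 = μ_2 = 1 gives the unweighted form, and setting one of them to zero gives the variant with only one line-segment term.

```latex
% Assumed structure of the combined cost; weights are optional per the text.
f_{12}(\Theta) = f_{\text{text}}(\Theta) + \mu_{1}\, f_{10}(\Theta) + \mu_{2}\, f_{11}(\Theta)
```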

The control unit 120 may estimate the variable values that minimize the cost function of Equation 12 and rectify the document image using the estimated values.

Meanwhile, the text lines extracted from the document image are substituted into the text-line term of Equation 12, so if there are falsely detected text lines, that is, components detected as text lines that are not actually text lines, rectification performance degrades. Likewise, the two line-segment cost functions of Equation 12 are designed on the premise that the line segments in the document image are straight and are arranged horizontally or vertically, so rectification performance also degrades if the document image contains curves or if its line segments are not arranged horizontally or vertically.

Therefore, to address these problems, the control unit 120 can repeat the process of applying the cost function to perform rectification and, on each pass, remove the components that make the value of the cost function large, thereby improving rectification performance.

For example, the control unit 120 performs a first rectification of the document image using the cost function of Equation (12) and removes the components that made the right-hand-side terms of Equation (12) largest during that first rectification. The control unit 120 then performs a second rectification, again using the cost function of Equation (12), on the image resulting from the first rectification. In the second rectification, the components removed during the first are excluded. During the second rectification, the control unit 120 likewise removes the components that made the right-hand-side terms of Equation (12) largest. By repeating this process, the control unit 120 can improve rectification performance.
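The repeated fit-and-remove procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `iterative_rectify`, `fit`, and `contribution` are hypothetical, the number of components dropped per pass is a free parameter, and for brevity each pass refits the surviving components directly rather than re-extracting them from the previously rectified image.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

C = TypeVar("C")  # one detected component (a text line or a line segment)
P = TypeVar("P")  # the estimated rectification parameters


def iterative_rectify(
    components: Sequence[C],
    fit: Callable[[List[C]], P],
    contribution: Callable[[C, P], float],
    passes: int = 2,
    drop_per_pass: int = 1,
) -> Tuple[P, List[C]]:
    """Fit, drop the components with the largest cost contribution, and refit."""
    active = list(components)
    params = fit(active)                      # first rectification pass
    for _ in range(passes - 1):
        if len(active) <= drop_per_pass:
            break                             # keep at least one component
        # Rank components by how much each adds to the cost under the current fit.
        active.sort(key=lambda c: contribution(c, params))
        # Discard the largest contributors (assumed removal rule).
        active = active[:len(active) - drop_per_pass]
        params = fit(active)                  # next pass excludes removed components
    return params, active


# Toy usage: each component is a measured skew angle in degrees,
# and 30.0 stands in for a falsely detected text line.
angles = [1.0, 1.2, 0.8, 1.0, 30.0]
skew, kept = iterative_rectify(
    angles,
    fit=lambda xs: sum(xs) / len(xs),          # stand-in for minimizing Equation (12)
    contribution=lambda a, m: (a - m) ** 2,    # stand-in per-component cost
)
```

In the toy usage, the false detection dominates the first fit, is removed, and the second fit is computed from the four remaining angles only.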

As described above, the document image rectification apparatus 100 according to an embodiment designs a cost function that reflects the properties of line segments in the non-text area of the document image and uses it in the rectification process, so that distortion in the non-text area, as well as in the text area, can be removed effectively.

FIG. 7 is a flowchart illustrating a document image rectification method according to an embodiment.

The document image rectification method according to the embodiment shown in FIG. 7 includes steps processed in time series by the document image rectification apparatus 100 shown in FIG. 1. Therefore, even where omitted below, the description given above of the document image rectification apparatus 100 shown in FIG. 1 also applies to the document image rectification method according to the embodiment shown in FIG. 7.

Referring to FIG. 7, in step 701, a first cost function whose value decreases as distortion in the document image is removed is designed based on the properties of the line segments included in the non-text area of the document image. Specifically, a fourth cost function is designed whose value decreases as a line segment in the non-text area becomes closer to a straight line, a fifth cost function is designed whose value decreases as a line segment in the non-text area becomes closer to the horizontal or vertical direction, and the fourth and fifth cost functions are combined to generate the first cost function.

To describe the design of the fourth cost function in detail: some of the line segments included in the non-text area are grouped according to a preset criterion, and the fourth cost function is designed so that its value decreases as the start point of each grouped line segment lies closer to a straight line. The details are given in the description above with reference to FIGS. 3 to 5, and the fourth cost function may be designed as in Equation (10).
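The grouping and straightness measure described above can be sketched as follows. Equation (10) is not reproduced in this passage, so the sum of squared point-to-line distances is an assumed form. The greedy chaining rule, the use of the endpoint gap as the "distance between line segments", and the names `group_segments`, `straightness_cost`, `max_gap`, and `max_angle` are likewise hypothetical. The sketch also assumes the segments arrive ordered along the underlying curve.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]  # (start point, end point)


def direction(seg: Segment) -> float:
    """Undirected orientation of a segment in radians, in [0, pi)."""
    (x0, y0), (x1, y1) = seg
    return math.atan2(y1 - y0, x1 - x0) % math.pi


def group_segments(segments: List[Segment], max_gap: float, max_angle: float) -> List[List[Segment]]:
    """Greedily chain consecutive segments that are close and similarly oriented."""
    groups: List[List[Segment]] = []
    for seg in segments:
        if groups:
            prev = groups[-1][-1]
            gap = math.dist(prev[1], seg[0])          # end of previous to start of next
            diff = abs(direction(prev) - direction(seg))
            diff = min(diff, math.pi - diff)          # orientations wrap around at pi
            if gap <= max_gap and diff <= max_angle:
                groups[-1].append(seg)
                continue
        groups.append([seg])
    return groups


def point_line_distance(p: Point, a: Point, b: Point) -> float:
    """Perpendicular distance from p to the infinite line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    length = math.hypot(bx - ax, by - ay)
    if length == 0.0:
        return math.hypot(px - ax, py - ay)
    return abs((bx - ax) * (py - ay) - (by - ay) * (px - ax)) / length


def straightness_cost(group: List[Segment]) -> float:
    """Assumed form: squared distances from each start point to the group's chord."""
    chord_start = group[0][0]    # start point of the first segment in the group
    chord_end = group[-1][1]     # end point of the last segment in the group
    return sum(point_line_distance(seg[0], chord_start, chord_end) ** 2 for seg in group)
```

A group whose start points all lie on the chord has zero cost, and the cost grows as the group bows away from it, which matches the behaviour required of the fourth cost function.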

To describe the design of the fifth cost function in detail: from among the line segments included in the non-text area, those longer than the average length of the text lines in the document image are extracted, and the fifth cost function is designed so that its value decreases as the extracted line segments become closer to the horizontal or vertical direction. The details are given in the description above with reference to FIG. 6, and the fifth cost function may be designed as in Equation (11).
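The selection of long segments and the alignment measure described above can be sketched as follows. Equation (11) is not reproduced in this passage, so the penalty sin²(2θ), which vanishes for horizontal and vertical segments and peaks at 45 degrees, is an assumed stand-in. The names `select_long_segments` and `alignment_cost` are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]  # (start point, end point)


def select_long_segments(segments: List[Segment], text_line_lengths: List[float]) -> List[Segment]:
    """Keep only segments longer than the mean text-line length."""
    mean_len = sum(text_line_lengths) / len(text_line_lengths)
    return [s for s in segments if math.dist(s[0], s[1]) > mean_len]


def alignment_cost(segments: List[Segment]) -> float:
    """Assumed penalty sin^2(2*theta): zero for horizontal or vertical segments."""
    total = 0.0
    for (x0, y0), (x1, y1) in segments:
        theta = math.atan2(y1 - y0, x1 - x0)
        total += math.sin(2.0 * theta) ** 2
    return total
```

Restricting the cost to long segments follows the text above; short strokes inside figures are more likely to be legitimately oblique, so they are left out of the penalty.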

Alternatively, the first cost function may be designed to include only one of the fourth and fifth cost functions. The first cost function may also be designed to include a coefficient that weights at least one of the fourth and fifth cost functions.
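Both variants described above can be expressed with a single weighted combination. The sketch below assumes the combination is a weighted sum. The function name `first_cost` and the weight names `w_straight` and `w_align` are hypothetical.

```python
def first_cost(straightness: float, alignment: float,
               w_straight: float = 1.0, w_align: float = 1.0) -> float:
    """Weighted sum of the fourth (straightness) and fifth (alignment) cost values."""
    return w_straight * straightness + w_align * alignment
```

Setting one weight to zero reproduces the case in which the first cost function includes only one of the two terms. Unequal nonzero weights emphasize one property over the other.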

In step 702, a second cost function whose value decreases as distortion in the document image is removed is designed based on the properties of the text lines included in the text area of the document image. Next, in step 703, the first cost function and the second cost function are combined to design a third cost function. In step 704, the distortion in the document image is removed using the third cost function.
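Steps 701 to 704 can be illustrated end to end with a deliberately simplified model. The sketch below assumes the only distortion is an in-plane rotation described by one angle, that text lines should become horizontal, and that non-text segments should become horizontal or vertical. The document's actual cost functions and variables are not reproduced here, so every function below is a hypothetical stand-in, and the grid search merely substitutes for whatever optimizer is used in practice.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]


def rotate(seg: Segment, angle: float) -> Segment:
    """Rotate both endpoints of a segment about the origin."""
    c, s = math.cos(angle), math.sin(angle)
    (x0, y0), (x1, y1) = seg
    return ((c * x0 - s * y0, s * x0 + c * y0), (c * x1 - s * y1, s * x1 + c * y1))


def first_cost(lines: List[Segment]) -> float:
    """Step 701 stand-in: non-text segments should be horizontal or vertical."""
    return sum(math.sin(2.0 * math.atan2(y1 - y0, x1 - x0)) ** 2
               for (x0, y0), (x1, y1) in lines)


def second_cost(text_lines: List[Segment]) -> float:
    """Step 702 stand-in: text-line baselines should be horizontal."""
    return sum(math.sin(math.atan2(y1 - y0, x1 - x0)) ** 2
               for (x0, y0), (x1, y1) in text_lines)


def rectify(lines: List[Segment], text_lines: List[Segment], steps: int = 721) -> Tuple[float, float]:
    """Steps 703 and 704: minimize the combined cost over candidate angles."""
    def third_cost(angle: float) -> float:
        return (first_cost([rotate(s, angle) for s in lines])
                + second_cost([rotate(s, angle) for s in text_lines]))

    # Candidate correction angles from -45 to +45 degrees.
    candidates = [math.radians(-45.0 + 90.0 * i / (steps - 1)) for i in range(steps)]
    best = min(candidates, key=third_cost)
    return best, third_cost(best)
```

For a page skewed by some angle within the search range, `rectify` returns approximately the opposite angle, since that correction drives both stand-in costs toward zero.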

FIG. 8 compares the result of the document image rectification method according to an embodiment with the result of an existing rectification method.

Here, the existing rectification method means a method that rectifies a document image using only a cost function reflecting the properties of the text lines in the text area. In that case, the elements in the non-text area are not reflected in the rectification process, so distortion in the non-text area is not removed.

In FIG. 8, the first image 81 is a document image captured with a camera, the second image 82 is the result of rectification using the existing method, and the third image 83 is the result of rectification using the method according to the embodiment described above. FIG. 8 shows that distortion is removed far more effectively in the third image 83 than in the second image 82.

The term "unit" as used in the above embodiments means software or a hardware component such as a field programmable gate array (FPGA) or an ASIC, and a "unit" performs certain roles. However, a "unit" is not limited to software or hardware. A "unit" may be configured to reside in an addressable storage medium or may be configured to execute on one or more processors. Thus, as an example, a "unit" includes components such as software components, object-oriented software components, class components, and task components, as well as processes, functions, attributes, procedures, subroutines, segments of program code, drivers, firmware, microcode, circuitry, data, databases, data structures, tables, arrays, and variables.

The functions provided by the components and "units" may be combined into a smaller number of components and "units", or further separated into additional components and "units".

In addition, the components and "units" may be implemented to execute on one or more CPUs in a device or a secure multimedia card.

The document image rectification method according to the embodiment described with reference to FIG. 7 may also be implemented in the form of a computer-readable medium storing computer-executable instructions and data. The instructions and data may be stored in the form of program code and, when executed by a processor, may generate a predetermined program module to perform a predetermined operation. The computer-readable medium may be any available medium that can be accessed by a computer, and includes volatile and non-volatile media and removable and non-removable media. The computer-readable medium may also be a computer recording medium, which may include volatile and non-volatile, removable and non-removable media implemented by any method or technology for storing information such as computer-readable instructions, data structures, program modules, or other data. For example, the computer recording medium may be a magnetic storage medium such as an HDD or SSD, an optical recording medium such as a CD, DVD, or Blu-ray disc, or memory included in a server accessible over a network.

The document image rectification method according to the embodiment described with reference to FIG. 7 may also be implemented as a computer program (or computer program product) containing computer-executable instructions. The computer program includes programmable machine instructions processed by a processor and may be implemented in a high-level programming language, an object-oriented programming language, an assembly language, a machine language, or the like. The computer program may also be recorded on a tangible computer-readable recording medium (for example, memory, a hard disk, a magnetic/optical medium, or a solid-state drive (SSD)).

Accordingly, the document image rectification method according to the embodiment described with reference to FIG. 7 may be implemented by a computing device executing the computer program described above. The computing device may include at least some of a processor, memory, a storage device, a high-speed interface connected to the memory and a high-speed expansion port, and a low-speed interface connected to a low-speed bus and the storage device. These components are interconnected using various buses and may be mounted on a common motherboard or in another suitable manner.

Here, the processor can process instructions within the computing device, for example instructions stored in the memory or the storage device for displaying graphic information to provide a graphical user interface (GUI) on an external input/output device, such as a display connected to the high-speed interface. In other embodiments, multiple processors and/or multiple buses may be used, as appropriate, together with multiple memories and types of memory. The processor may also be implemented as a chipset made up of chips that include multiple independent analog and/or digital processors.

The memory stores information within the computing device. In one example, the memory may consist of a volatile memory unit or a set of such units. In another example, the memory may consist of a non-volatile memory unit or a set of such units. The memory may also be another form of computer-readable medium, such as a magnetic or optical disk.

The storage device can provide a large amount of storage space to the computing device. The storage device may be a computer-readable medium or a configuration that includes such a medium. It may include, for example, devices in a storage area network (SAN) or other configurations, and may be a floppy disk device, a hard disk device, an optical disk device, a tape device, flash memory, or another similar semiconductor memory device or device array.

The embodiments described above are illustrative, and a person of ordinary skill in the art to which they belong will understand that they can readily be modified into other specific forms without changing their technical idea or essential features. The embodiments described above should therefore be understood as illustrative in all respects and not restrictive. For example, each component described as a single unit may be implemented in a distributed manner, and components described as distributed may likewise be implemented in combined form.

The scope of protection sought through this specification is defined by the claims below rather than by the detailed description above, and should be construed to include all changes or modifications derived from the meaning and scope of the claims and their equivalents.

100: document image rectification apparatus 110: input/output unit
120: control unit 130: storage unit
140: communication unit

Claims

A method for rectifying a document image, the method comprising:
designing a first cost function based on properties of line segments included in a non-text area of the document image;
designing a second cost function based on properties of text lines included in a text area of the document image;
combining the first cost function and the second cost function to generate a third cost function; and
removing distortion present in the document image using the third cost function,
wherein designing the first cost function comprises:
designing a fourth cost function whose value decreases as a line segment included in the non-text area becomes closer to a straight line;
designing a fifth cost function whose value decreases as a line segment included in the non-text area becomes closer to the horizontal or vertical direction; and
combining the fourth cost function and the fifth cost function to generate the first cost function,
wherein designing the fourth cost function comprises:
grouping some of the line segments included in the non-text area according to a preset criterion; and
designing the fourth cost function such that its value decreases as the start point of each of the grouped line segments lies closer to the straight line connecting the start point and the end point of the two line segments located at both ends of the grouped line segments.

(Deleted)

The method according to claim 1,
wherein the grouping comprises:
extracting and grouping line segments that satisfy the conditions that a distance between the line segments is equal to or less than a preset value and a difference in direction between the line segments is equal to or less than a preset value.

The method according to claim 1,
wherein designing the fifth cost function comprises:
extracting, from among the line segments included in the non-text area, line segments longer than an average length of the text lines included in the document image; and
designing the fifth cost function such that its value decreases as the extracted line segments become closer to the horizontal or vertical direction.

The method according to claim 1,
Wherein the first cost function comprises only one of the fourth cost function or the fifth cost function.

The method according to claim 1,
Wherein the first cost function comprises a coefficient giving a weight to at least one of the fourth cost function or the fifth cost function.

A computer-readable recording medium on which a program for carrying out the method according to claim 1 is recorded.

A computer program stored in a medium, the program being executed by a document image rectification apparatus to perform the method according to claim 1.

A document image rectification apparatus comprising:
an input/output unit configured to receive input related to processing of a document image and to display the status and result of the processing of the document image;
a storage unit configured to store a program for rectifying the document image; and
a control unit configured to rectify the document image by executing the program,
wherein the control unit designs a first cost function based on properties of line segments included in a non-text area of the document image, designs a second cost function based on properties of text lines included in a text area of the document image, and removes distortion present in the document image using a third cost function generated by combining the first cost function and the second cost function,
wherein the control unit designs a fourth cost function whose value decreases as a line segment included in the non-text area becomes closer to a straight line, designs a fifth cost function whose value decreases as a line segment included in the non-text area becomes closer to the horizontal or vertical direction, and combines the fourth cost function and the fifth cost function to generate the first cost function, and
wherein the control unit groups some of the line segments included in the non-text area according to a preset criterion, and designs the fourth cost function such that its value decreases as the start point of each of the grouped line segments lies closer to the straight line connecting the start point and the end point of the two line segments located at both ends of the grouped line segments.

(Deleted)

The apparatus according to claim 9,
wherein the control unit
extracts and groups line segments that satisfy the conditions that a distance between the line segments is equal to or less than a preset value and a difference in direction between the line segments is equal to or less than a preset value.

The apparatus according to claim 9,
wherein the control unit
extracts, from among the line segments included in the non-text area, line segments longer than an average length of the text lines included in the document image, and designs the fifth cost function such that its value decreases as the extracted line segments become closer to the horizontal or vertical direction.

The apparatus according to claim 9,
wherein the control unit
designs the first cost function to include only one of the fourth cost function or the fifth cost function.

The apparatus according to claim 9,
wherein the control unit
designs the first cost function to include a coefficient that weights at least one of the fourth cost function or the fifth cost function.