KR101831783B1 - Apparatus for image and text recognition included in output printout and method thereof

Info

Publication number: KR101831783B1
Application number: KR1020170074166A
Authority: KR
Inventors: 이희용
Original assignee: 주식회사 처음마음
Priority date: 2016-10-27
Filing date: 2017-06-13
Publication date: 2018-02-27

Abstract

The present invention relates to an apparatus and a method for recognizing images and text included in a printout. The image and text recognition apparatus includes: an input unit that receives a captured image of a printout to be recognized; an image analysis unit that uses deep learning to distinguish images from text in the captured image and generates pattern data by analyzing the paragraphs and arrangement patterns of the images and text; a database that stores pattern data for a plurality of different images and texts; a search unit that compares the generated pattern data with the pre-stored pattern data to find matching pattern data; and a control unit that displays a screen or image corresponding to the matching pattern data if such pattern data exists. According to the present invention, correcting the captured image with preset reference values for a plurality of shooting environments reduces the amount of initial data needed to analyze an image or text, thereby improving analysis performance and accuracy. Moreover, the present invention provides data equivalent to a marker function based on the result of text or image analysis, without a separate marker such as a QR code, so that a variety of information related to the text or image can be provided conveniently.

Description

APPARATUS FOR IMAGE AND TEXT RECOGNITION INCLUDED IN OUTPUT PRINTOUT AND METHOD THEREOF

The present invention relates to an apparatus and method for recognizing images and text included in a printout, and more particularly to an apparatus and method that recognize images and text in a printout and provide information corresponding to them.

Conventionally, commercial character recognition has focused on scanner-based image generation and document processing. Recently, however, as the development of mobile terminals has made users more mobile, there is demand for services that acquire images through a camera mounted on a user terminal, collect information by recognizing the images and text in the captured image, and provide content related to the recognized images and text.

Accordingly, not only mobile terminal and camera manufacturers but also mobile carriers, portal operators, solution vendors, and content providers capable of building services are developing technologies that recognize images and text in captured images and use them to provide various services.

However, because a captured image is strongly affected by the shooting environment, the same image or text captured in different shooting environments may be recognized as different images or text.

For example, when the shooting environment is poor, as with light reflection, poor lighting, or a wrinkled printout, the perceived colors may vary with the brightness of the lighting, and the images or text in the captured image may be recognized inaccurately.

In particular, an abstract image is a combination of diverse elements such as lines, surfaces, shapes, colors, and sizes, and the portion that is recognized can vary with the shooting environment, so simple conditions are insufficient to distinguish its line forms.

That is, a technique is required that resolves misrecognition across various shooting environments and accurately extracts and recognizes images and text.

The background art of the present invention is disclosed in Korean Laid-Open Patent Publication No. 10-2014-0068302 (published on November 26, 2012).

The technical object of the present invention is to provide an apparatus and method for recognizing images and text included in a printout, which recognize the images and text in the printout and provide information corresponding to them.

According to an embodiment of the present invention for achieving this technical object, an apparatus for recognizing images and text included in a printout includes: an input unit that receives a captured image of a printout to be recognized; an image analysis unit that uses deep learning to distinguish images from text in the captured image and generates pattern data by analyzing the paragraphs and arrangement patterns of the images and text; a database that stores pattern data for a plurality of images and texts of different forms; a search unit that compares the generated pattern data with the pre-stored pattern data to find pattern data matching the generated pattern data; and a control unit that, if matching pattern data exists, displays a screen or image corresponding to that pattern data.

The image analysis unit may estimate the shooting environment from at least one of the color, brightness, and degree of wrinkling of the captured image, and may correct the captured image with a preset reference value corresponding to the estimated shooting environment.

The image analysis unit may use deep learning to analyze at least one of the outline, shape, color, and size of an image in the captured image to distinguish images from text, may separate paragraphs using the indentation and spacing of the text, and may extract an arrangement pattern from the placement positions of the images and text.

When a plurality of matching pattern data exist in the database, the search unit may perform a primary search for matching pattern data in the database using at least one of the font type, size, and weight of the text characters and the size of the spacing between characters.

When a plurality of matching pattern data still exist in the database after the primary search, the search unit may perform a secondary search for matching pattern data in the database using keywords that appear in the text at least a threshold number of times.

If pattern data matching the captured image exists, the control unit may display or execute, in conjunction, at least one of a matched video, image, text, chart, related URL, and related app.

According to another embodiment of the present invention, an image and text recognition method using an image and text recognition apparatus includes: receiving a captured image of a printout to be recognized; using deep learning to distinguish images from text in the captured image and generating pattern data by analyzing the paragraphs and arrangement patterns of the images and text; comparing the generated pattern data with pre-stored pattern data, using a database that stores pattern data for a plurality of images and texts of different forms, to search for pattern data matching the generated pattern data; and, if matching pattern data exists, displaying a screen or image corresponding to that pattern data.

According to the present invention, correcting the captured image with preset reference values for a plurality of shooting environments reduces the amount of initial data needed to analyze an image or text, which in turn improves analysis performance and accuracy.

In addition, according to the present invention, data equivalent to a marker function is provided based on the result of analyzing the recognized text or image, without a separate marker such as a QR code, so that a variety of information related to the text or image can be provided conveniently.

FIG. 1 is a diagram illustrating a system for recognizing images and text included in a printout according to an embodiment of the present invention.
FIG. 2 is a block diagram of an image and text recognition apparatus according to an embodiment of the present invention.
FIG. 3 is a flowchart illustrating an image and text recognition method of an image and text recognition apparatus according to an embodiment of the present invention.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings so that those of ordinary skill in the art to which the invention pertains can readily practice it. The present invention may, however, be embodied in many different forms and is not limited to the embodiments described herein. In the drawings, parts unrelated to the description are omitted for clarity, and similar parts are given similar reference numerals throughout the specification.

Throughout the specification, when a part is said to "include" a component, this means that it may further include other components, rather than excluding them, unless specifically stated otherwise.

Embodiments of the present invention are now described in detail with reference to the accompanying drawings so that those of ordinary skill in the art to which the invention pertains can readily practice it.

Hereinafter, the system for recognizing images and text included in a printout is described in detail with reference to FIG. 1.

FIG. 1 is a diagram illustrating a system for recognizing images and text included in a printout according to an embodiment of the present invention.

As shown in FIG. 1, the image and text recognition system according to an embodiment of the present invention includes an image and text recognition apparatus 200 for recognizing a printout 100, and a camera 300.

First, the printout 100 is the object to be recognized and includes pamphlets, flyers, information sheets, promotional materials, catalogs, booklets, business cards, and the like.

Next, the image and text recognition apparatus 200 analyzes the printout 100, distinguishes the images and text it contains, and generates pattern data by analyzing the arrangement pattern, margin sizes, and the like. The image and text recognition apparatus 200 then compares the generated pattern data with the pre-stored pattern data and, when it finds matching pre-stored pattern data, displays the corresponding screen or image.

Meanwhile, the image and text recognition apparatus 200 stores, in a database, pattern data for a plurality of images and texts of different forms, and each pattern data item is stored with a link to at least one of a matching video, image, text, chart, related URL, and related app.

In addition, the image and text recognition apparatus 200 stores, in the database, reference values for the color, brightness, and degree of wrinkling of captured images for various shooting environments.

Next, the camera 300 may be configured as a separate device or mounted on the image and text recognition apparatus 200, and is not limited to any particular type of camera.

The camera 300 photographs the printout 100 and transmits the captured image to the image and text recognition apparatus 200, to which it is connected over a network.

That is, the image and text recognition apparatus 200 and the camera 300 are connected to each other over a network and can exchange information. Examples of such a communication network include, but are not limited to, wired networks and also wireless networks such as the Internet, a LAN (Local Area Network), a Wireless LAN (Wireless Local Area Network), a WAN (Wide Area Network), a PAN (Personal Area Network), 3G, 4G, and Wi-Fi.

Hereinafter, the image and text recognition apparatus 200, which recognizes images and text included in a printout, is described in detail with reference to FIG. 2.

FIG. 2 is a block diagram of an image and text recognition apparatus according to an embodiment of the present invention. As shown in FIG. 2, the image and text recognition apparatus 200 according to an embodiment of the present invention includes an input unit 210, an image analysis unit 220, a database 230, a search unit 240, and a control unit 250.

First, the input unit 210 receives a captured image of a printout to be recognized. That is, the input unit 210 may receive the captured image in real time from a linked camera, or may download the captured image over a network.

Here, the printout includes pamphlets, flyers, information sheets, promotional materials, catalogs, booklets, and the like.

Next, the image analysis unit 220 uses deep learning to distinguish images from text in the captured image, extracts each paragraph and the arrangement pattern, and generates pattern data.

Here, the image analysis unit 220 uses deep learning, a technique for clustering or classifying objects or data, to distinguish images from text through an iterative process of extracting the lines contained in the captured image and distorting them.

In addition, the image analysis unit 220 may generate pattern data using the ratio of the printout's size within the captured image and the sizes of the top, bottom, left, and right margins determined by where an image or text is positioned on the printout.
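
The margin-based part of the pattern data can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `Box` type, the `margin_features` function, and the choice to normalise margins by the printout's own width and height are all introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in pixel coordinates (hypothetical helper type)."""
    left: float
    top: float
    right: float
    bottom: float


def margin_features(page: Box, block: Box, frame_w: float, frame_h: float) -> dict:
    """Return the printout-to-frame area ratio and the block's four margins.

    Margins are divided by the printout's width or height (an assumption),
    so they do not depend on how far away the camera was.
    """
    page_w = page.right - page.left
    page_h = page.bottom - page.top
    if page_w <= 0 or page_h <= 0 or frame_w <= 0 or frame_h <= 0:
        raise ValueError("page and frame must have positive size")
    return {
        "page_ratio": (page_w * page_h) / (frame_w * frame_h),
        "top": (block.top - page.top) / page_h,
        "bottom": (page.bottom - block.bottom) / page_h,
        "left": (block.left - page.left) / page_w,
        "right": (page.right - block.right) / page_w,
    }
```

Normalising by the printout size is one possible design choice; it keeps the features comparable when the same printout is photographed at different distances.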

Next, the database 230 stores pattern data for a plurality of images and texts of different forms. The database 230 also stores, matched to each pattern data item, highly related videos, images, text, charts, related URLs, related apps, and the like.

Next, the search unit 240 compares the generated pattern data with the pre-stored pattern data and searches for pattern data that matches the generated pattern data. When a plurality of matching pattern data exist, the search unit 240 searches for the best-matching pattern data using the font type, size, and weight of the text in the captured image and the size of the spacing between characters.

Next, if matching pattern data exists, the control unit 250 displays a screen or image corresponding to that pattern data. In addition to the screen or image, the control unit 250 may display or execute, in conjunction, a video, image, text, chart, related URL, or related app.

Hereinafter, the process by which the image and text recognition apparatus 200 according to an embodiment of the present invention recognizes images and text included in a printout is described with reference to FIG. 3.

FIG. 3 is a flowchart illustrating an image and text recognition method of an image and text recognition apparatus according to an embodiment of the present invention.

First, the image and text recognition apparatus 200 receives a captured image of a printout to be recognized (S310).

Here, the image and text recognition apparatus 200 may receive the captured image of the printout 100 in real time as it is photographed, or may receive a captured image of the printout 100 photographed at an earlier time.

Next, the image and text recognition apparatus 200 uses deep learning to distinguish images from text in the captured image, extracts each paragraph and the arrangement pattern, and generates pattern data (S320).

First, the image and text recognition apparatus 200 can estimate the shooting environment from at least one of the color, brightness, and degree of wrinkling of the captured image. The image and text recognition apparatus 200 can then correct the captured image with a preset reference value corresponding to the estimated shooting environment.

The image and text recognition apparatus 200 according to an embodiment of the present invention assumes various shooting environments and extracts the color, brightness, and degree-of-wrinkling values of the printout 100 photographed in each of them. Then, through simulations that recognize the images and text of the printout 100 while adjusting the color, brightness, and degree-of-wrinkling values of the captured image in each shooting environment, the image and text recognition apparatus 200 can extract the values that give the highest recognition rate for images and text.

The image and text recognition apparatus 200 can then set and store the extracted color, brightness, and degree-of-wrinkling values as the reference values for each shooting environment.
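
The calibration described above, which tries several adjustment values per shooting environment and keeps the one with the highest recognition rate, might be sketched as follows. The function names and the idea of passing the simulation in as a callable are assumptions made for illustration; the source does not say how the simulation is run or which values are tried.

```python
from typing import Callable, Dict, Iterable


def pick_reference_value(
    candidates: Iterable[float],
    recognition_rate: Callable[[float], float],
) -> float:
    """Return the candidate adjustment with the highest simulated recognition rate.

    `recognition_rate` is a hypothetical stand-in for the simulation that
    recognises images and text after applying the adjustment.
    """
    best_value, best_rate = None, float("-inf")
    for value in candidates:
        rate = recognition_rate(value)
        if rate > best_rate:
            best_value, best_rate = value, rate
    if best_value is None:
        raise ValueError("no candidate values supplied")
    return best_value


def build_reference_table(
    environments: Dict[str, Callable[[float], float]],
    candidates: list,
) -> Dict[str, float]:
    """Map each environment name to its best-performing adjustment value."""
    return {
        name: pick_reference_value(candidates, rate_fn)
        for name, rate_fn in environments.items()
    }
```

In practice one such table would be built per attribute (color, brightness, degree of wrinkling); a single scalar per environment is used here only to keep the sketch short.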

Thus, upon receiving a captured image, the image and text recognition apparatus 200 extracts its color, brightness, and degree-of-wrinkling values to estimate the shooting environment, and can correct the image with the preset reference values corresponding to the estimated shooting environment.
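
A brightness-only version of this estimate-then-correct step can be sketched as below. The environment names, the numbers in `REFERENCE`, and the use of a simple gain toward a target mean brightness are all assumptions for illustration; the source does not specify the estimator or the form of the correction, and it also considers color and degree of wrinkling.

```python
# Hypothetical reference table: environment -> (typical mean brightness, target mean brightness).
REFERENCE = {
    "dim": (60.0, 128.0),
    "normal": (128.0, 128.0),
    "glare": (210.0, 128.0),
}


def mean_brightness(pixels):
    """Mean of a grayscale image given as a list of rows of 0-255 values."""
    flat = [p for row in pixels for p in row]
    if not flat:
        raise ValueError("image is empty")
    return sum(flat) / len(flat)


def estimate_environment(pixels, reference=REFERENCE):
    """Pick the environment whose typical brightness is closest to the image's mean."""
    mean = mean_brightness(pixels)
    return min(reference, key=lambda name: abs(reference[name][0] - mean))


def correct_image(pixels, reference=REFERENCE):
    """Scale the image so its mean brightness moves to the environment's target value."""
    env = estimate_environment(pixels, reference)
    mean = mean_brightness(pixels)
    gain = reference[env][1] / mean if mean > 0 else 1.0
    corrected = [[min(255, round(p * gain)) for p in row] for row in pixels]
    return env, corrected
```

For example, `correct_image([[50, 70], [60, 60]])` classifies the image as `"dim"` under these assumed values and brightens it toward the target mean.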

Meanwhile, when the degree-of-wrinkling value of the printout 100 in the captured image is greater than the reference value, the image and text recognition apparatus 200 can use the estimated aspect ratio of the printout 100 to flatten the wrinkled regions and render, through augmented reality, a captured image of the printout 100 with the wrinkles removed.

Next, the image and text recognition apparatus 200 can use deep learning to analyze at least one of the outline, shape, color, and size of an image in the captured image to distinguish images from text.

The image and text recognition apparatus 200 can then separate paragraphs using the indentation and spacing of the text in the captured image, and extract an arrangement pattern from the placement positions of the images and text.
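
Paragraph separation by indentation and spacing can be sketched as follows, assuming text lines have already been located as rectangles. The `TextLine` type, the tolerance parameters, and the reading of "spacing" as the vertical gap between consecutive lines are assumptions made for this illustration.

```python
from dataclasses import dataclass


@dataclass
class TextLine:
    """Detected text line (hypothetical type): left edge, top edge, and height in pixels."""
    x: float
    y: float
    height: float


def split_paragraphs(lines, indent_tol=5.0, gap_factor=1.5):
    """Group lines into paragraphs.

    A new paragraph starts when a line is indented relative to the leftmost
    line, or when the vertical gap to the previous line exceeds
    `gap_factor` times the previous line's height.
    """
    if not lines:
        return []
    ordered = sorted(lines, key=lambda line: line.y)
    body_x = min(line.x for line in ordered)  # left edge of unindented text
    paragraphs = [[ordered[0]]]
    for prev, cur in zip(ordered, ordered[1:]):
        gap = cur.y - (prev.y + prev.height)
        indented = cur.x - body_x > indent_tol
        if indented or gap > gap_factor * prev.height:
            paragraphs.append([cur])
        else:
            paragraphs[-1].append(cur)
    return paragraphs
```

The bounding boxes of the resulting paragraphs, together with those of the detected images, could then serve as the placement positions from which an arrangement pattern is derived.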

In addition, for a captured image of the printout 100 rendered in augmented reality with wrinkles removed, the image and text recognition apparatus 200 can use deep learning to estimate whether a region hidden by wrinkling is an image region or a text region.

For example, the image and text recognition apparatus 200 may analyze the colors of the area surrounding a region hidden by wrinkling in the captured image and classify the region as an image if a plurality of colors are detected, or may use deep learning and classify the region as an image if line and shape elements are detected at or above a reference value. The image and text recognition apparatus 200 can also estimate the size of the image or text paragraph in the hidden region based on the sizes of the images or text paragraphs estimated in the other, undamaged regions of the printout 100 in the captured image.
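
The classification rule in this example can be written as a small decision function. The function name, both thresholds, and the treatment of paper plus ink as two colors that do not yet count as "a plurality" are assumptions made for illustration; the element count is taken as an input because the source attributes it to a deep learning step that is not specified.

```python
def classify_hidden_region(border_colors, element_count,
                           color_threshold=3, element_threshold=3):
    """Classify a wrinkle-hidden region as "image" or "text".

    border_colors: colors sampled around the hidden region, e.g. (r, g, b) tuples.
    element_count: number of line/shape elements reported by a separate detector.
    Both thresholds are illustrative assumptions, not values from the source.
    """
    distinct_colors = len(set(border_colors))
    if distinct_colors >= color_threshold or element_count >= element_threshold:
        return "image"
    return "text"
```

Real photographs rarely contain exactly repeated pixel values, so a practical version would probably quantise the sampled colors before counting them.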

Next, the image and text recognition apparatus 200 compares the pattern data with the pattern data pre-stored in the database and searches for matching pattern data (S330).

Here, the database stores pattern data for a plurality of images and texts of different forms, and each pattern data item is matched with at least one of a highly related video, image, text, chart, related URL, and related app.

When a plurality of matching pattern data exist in the database, the image and text recognition apparatus 200 can extract, from the text paragraphs in the captured image, the font type, size, and weight of the characters and the size of the spacing between characters.

The image and text recognition apparatus 200 can then perform a primary search for matching pattern data in the database using at least one of the extracted font type, size, and weight of the characters and the size of the spacing between characters.

That is, the image and text recognition apparatus 200 can search for pattern data whose text characteristics are the same as the font type, size, weight, and inter-character spacing of the text extracted from the captured image.

Meanwhile, when a plurality of matching pattern data still exist in the database after the primary search, the image and text recognition apparatus 200 can extract keywords that appear in the text at least a threshold number of times. The image and text recognition apparatus 200 can then perform a secondary search for matching pattern data in the database using the extracted keywords.

That is, from among the plurality of matching pattern data remaining after the primary search, the image and text recognition apparatus 200 can search for the pattern data that contains the extracted keywords or contains the most of them.

In this way, the image and text recognition apparatus 200 can vary the search conditions stage by stage to find the pre-stored pattern data in the database that best matches the generated pattern data.
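
The staged narrowing described above (layout match, then font attributes, then repeated keywords) might look like the sketch below. The `Pattern` record, the exact-equality comparisons, and the fallback when no candidate shares the query's font attributes are assumptions for illustration; the source does not define how a "match" is computed.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Pattern:
    """Stored pattern data (hypothetical record layout)."""
    pattern_id: str
    layout: tuple                      # arrangement-pattern signature
    font: tuple                        # (type, size, weight, letter spacing)
    keywords: frozenset = field(default_factory=frozenset)


def frequent_keywords(words, threshold):
    """Return the words that occur at least `threshold` times (case-insensitive)."""
    counts = Counter(word.lower() for word in words)
    return {word for word, n in counts.items() if n >= threshold}


def staged_search(layout, font, words, database, threshold=3):
    """Return the best-matching Pattern, or None if the layout matches nothing."""
    hits = [p for p in database if p.layout == layout]
    if len(hits) <= 1:
        return hits[0] if hits else None
    # Primary search: keep candidates with the same font attributes.
    # Assumption: if none share them, keep the layout matches instead of failing.
    hits = [p for p in hits if p.font == font] or hits
    if len(hits) == 1:
        return hits[0]
    # Secondary search: prefer the candidate sharing the most repeated keywords.
    keys = frequent_keywords(words, threshold)
    return max(hits, key=lambda p: len(p.keywords & keys))
```

Measurements taken from a photograph are noisy, so a practical version would likely compare font attributes within a tolerance rather than by strict equality.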

Next, if matching pattern data exists, the image and text recognition apparatus 200 displays a screen or image corresponding to that pattern data (S340).

That is, if pattern data matching the captured image exists, the image and text recognition apparatus 200 can display or execute, in conjunction, at least one of a matched video, image, text, chart, related URL, and related app.

Here, the matched video may use augmented reality to provide an image with a three-dimensional effect.

As described above, according to an embodiment of the present invention, correcting the captured image with preset reference values for a plurality of shooting environments reduces the amount of initial data needed to analyze an image or text, which in turn improves analysis performance and accuracy.

In addition, according to an embodiment of the present invention, data equivalent to a marker function is provided based on the result of analyzing the recognized text or image, without a separate marker such as a QR code, so that a variety of information related to the text or image can be provided conveniently.

Although the present invention has been described with reference to the embodiments shown in the drawings, these are merely exemplary, and those of ordinary skill in the art will understand that various modifications and other equivalent embodiments are possible. Accordingly, the true scope of technical protection of the present invention should be determined by the technical idea of the appended claims.

100: printout 200: image and text recognition apparatus
210: input unit 220: image analysis unit
230: database 240: search unit
250: control unit 300: camera

Claims

1. An apparatus for recognizing images and text included in a printout, comprising:
an input unit that receives a captured image of a printout to be recognized;
an image analysis unit that uses deep learning to distinguish images from text in the captured image and generates pattern data by analyzing the paragraphs and arrangement patterns of the images and text;
a database that stores pattern data for a plurality of images and texts of different forms;
a search unit that compares the generated pattern data with the pre-stored pattern data to search for pattern data matching the generated pattern data; and
a control unit that, if matching pattern data exists, displays a screen or image corresponding to that pattern data.

2. The apparatus of claim 1,
wherein the image analysis unit
estimates the photographing environment from at least one of the color, brightness, and degree of wrinkling of the photographed image, and corrects the photographed image with a preset reference value corresponding to the estimated photographing environment.

3. The apparatus of claim 1,
wherein the image analysis unit
analyzes, using deep learning, at least one of the outline, shape, color, and size of an image in the photographed image,
separates paragraphs using the indentation and spacing of the text, and
extracts an arrangement pattern from the arrangement positions of the image and the text.

4. The apparatus of claim 3,
wherein the search unit,
if a plurality of matching pattern data exist in the database, performs a primary search for matching pattern data in the database using at least one of the font type, size, and thickness of the text and the size of the spacing between characters.

5. The apparatus of claim 4,
wherein the search unit,
if a plurality of matching pattern data still exist in the database after the primary search,
performs a secondary search for matching pattern data in the database using a keyword that is repeated in the text at least a threshold number of times.

6. The apparatus of claim 1,
wherein the control unit,
if pattern data corresponding to the photographed image exists, displays in a linked manner at least one of a matching video, image, text, diagram, related URL, and related app.

7. A method of recognizing images and text using an image and text recognition apparatus, the method comprising:
receiving a photographed image of an output printout to be recognized;
separating images and text from the photographed image using deep learning, and analyzing paragraphs and the arrangement pattern of the images and text to generate pattern data;
comparing the generated pattern data with pre-stored pattern data, using a database that stores pattern data for a plurality of different types of images and text, and searching for pattern data matching the generated pattern data; and
if matching pattern data exists, displaying a screen or an image corresponding to the matching pattern data.

8. The method of claim 7,
wherein generating the pattern data comprises:
estimating the photographing environment from at least one of the color, brightness, and degree of wrinkling of the photographed image, and correcting the photographed image with a preset reference value corresponding to the estimated photographing environment.

9. The method of claim 7,
wherein generating the pattern data comprises:
analyzing, using deep learning, at least one of the outline, shape, color, and size of an image in the photographed image;
separating paragraphs using the indentation and spacing of the text; and
extracting an arrangement pattern from the arrangement positions of the image and the text.

10. The method of claim 9,
wherein the searching comprises:
if a plurality of matching pattern data exist in the database, performing a primary search for matching pattern data in the database using at least one of the font type, size, and thickness of the text and the size of the spacing between characters.

11. The method of claim 10,
wherein the searching comprises:
if a plurality of matching pattern data still exist in the database after the primary search,
performing a secondary search for matching pattern data in the database using a keyword that is repeated in the text at least a threshold number of times.

12. The method of claim 7,
wherein the displaying comprises:
if matching pattern data exists for the photographed image, displaying and executing at least one of a matching video, image, text, diagram, related URL, and related app.
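
The staged narrowing recited in claims 10 and 11 can be sketched as follows. This is an illustrative reading, not the patented implementation: it assumes each stored candidate is a dictionary with a `font` tuple and a `keywords` list, that font attributes are compared for equality, and that keywords are whitespace-separated words. The record layout, all function names, and the default threshold of 3 are hypothetical.

```python
from collections import Counter


def primary_search(candidates, font):
    """Keep candidates whose stored font attributes equal those of the query text."""
    return [c for c in candidates if c["font"] == font]


def repeated_keywords(text, threshold):
    """Return the words that occur at least `threshold` times in the text."""
    counts = Counter(text.lower().split())
    return {word for word, n in counts.items() if n >= threshold}


def secondary_search(candidates, text, threshold=3):
    """Keep candidates that share at least one repeated keyword with the text."""
    keys = repeated_keywords(text, threshold)
    return [c for c in candidates if keys & set(c["keywords"])]


def staged_search(candidates, font, text, threshold=3):
    """Apply each stage only while more than one candidate remains."""
    if len(candidates) > 1:
        candidates = primary_search(candidates, font)
    if len(candidates) > 1:
        candidates = secondary_search(candidates, text, threshold)
    return candidates


# Hypothetical stored pattern data that all matched on layout.
stored = [
    {"id": "A", "font": ("serif", 12), "keywords": ["coffee"]},
    {"id": "B", "font": ("serif", 12), "keywords": ["tea"]},
    {"id": "C", "font": ("sans", 10), "keywords": ["coffee"]},
]
result = staged_search(stored, ("serif", 12), "coffee beans coffee roast coffee")
```

In this example the font stage removes candidate C, and the keyword stage then separates A from B because "coffee" appears three times in the query text.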