KR102374281B1

KR102374281B1 - Importance Determination System of Text Block Extracted from Image and Its Method

Info

Publication number: KR102374281B1
Application number: KR1020200024023A
Authority: KR
Inventors: 박지혁; 한예지; 장민성
Original assignee: 주식회사 와들
Priority date: 2020-02-27
Filing date: 2020-02-27
Publication date: 2022-03-16
Also published as: WO2021172699A1; KR20210109146A

Abstract

The present invention relates to a system and method for determining the importance of text blocks extracted from an image. The system comprises: a character recognition unit that extracts text from an input image; a character block unit that generates text blocks by dividing the extracted text into sentences; and an operation unit that extracts features from each text block to assign characteristics to that block, and classifies the block as an output target text block when its characteristic value exceeds a preset threshold value.
According to the present invention, features are extracted from text blocks generated by recognizing the text in an image, and the importance of those features is calculated to perform binary classification of output target text. As a result, only the text in the image that meets the calculated importance is selected and output, and, in conjunction with a screen reader, only the information that visually impaired users need when shopping online can be provided by voice.

Description

Importance Determination System of Text Block Extracted from Image and Its Method

The present invention relates to a system and method for determining the importance of text blocks extracted from an image, and more particularly to a technique for selecting and outputting, from the text recognized in an image, only the text that meets a calculated importance.

An optical character reader (OCR) is a device that reads characters using light. It converts the information carried by the shapes of letters, digits, or other symbols, whether printed on paper or handwritten, into electrical signals encoded for a digital computer.

Many companies and research institutes are developing OCR models. Because these models are developed with a focus on precisely recognizing all of the text in an image, it has recently become possible to recognize even small, blurry characters.

However, because conventional OCR also recognizes text unrelated to the important content, such as small text printed in the background, the unnecessary text must then be filtered out, which is cumbersome.

In addition, when text recognized by OCR is output through a text-to-speech (TTS) function, all of the text is read aloud, including the unnecessary text, which makes it difficult to convey the intended meaning to the listener.

Korean Patent Application Publication No. 10-2017-0010843 (published February 1, 2017)

An object of the present invention is to extract features from text blocks generated by recognizing the text in an image, and to perform binary classification of output target text by calculating the importance of the extracted features, so that only the text in the image that meets the calculated importance is selected and output.

Another object of the present invention is to select and output only the text in an image that meets the calculated importance, so that, when applied to a shopping mall image containing product information, only noise-free product information is provided as text and, in conjunction with a screen reader, only the necessary information is conveyed by voice without the user having to look at the screen.

Another object of the present invention is to significantly reduce the computation time needed to select output target text blocks by selecting them according to an importance derived through supervised learning.

To solve the above technical problem, one embodiment of the present invention is a system for determining the importance of text blocks extracted from an image, comprising: a character recognition unit that extracts text from an input image; a character block unit that generates text blocks by dividing the extracted text into sentences; and an operation unit that extracts features from each text block to assign characteristics to that block, and classifies the block as an output target text block when its characteristic value exceeds a preset threshold value.

Preferably, the features include at least one attribute value of the text block among size, width, length, character confidence, and slope.

The operation unit extracts the features contained in each text block received from the character block unit and assigns characteristics to each block. When a text block's characteristic value exceeds the preset threshold value, the operation unit determines that the block is an output target text block and labels it '1'; when the characteristic value is less than or equal to the preset threshold value, it labels the block '0' and filters it out.

The system further comprises a learning unit that, through supervised learning, assigns a value between '0' and '1' to each text block characteristic value.

In this case, the operation unit passes the characteristic values of the text blocks labeled '0' to the learning unit, and the learning unit feeds those values into an artificial neural network to label the text blocks again. Text blocks labeled '1' are passed to the output unit.

A method for determining the importance of text blocks extracted from an image according to an embodiment of the present invention, based on the system described above, comprises: step (a), in which the character recognition unit extracts text from an input image; step (b), in which the character block unit generates text blocks by dividing the extracted text into sentences; step (c), in which the operation unit extracts features from each text block and assigns characteristics to that block; step (d), in which the operation unit determines whether the text block characteristic value exceeds a preset threshold value; step (e), in which, when the determination in step (d) is that the characteristic value exceeds the preset threshold value, the operation unit determines that the block is an output target text block and labels it '1'; and step (f), in which the operation unit outputs the text blocks labeled '1' and removes the remaining text blocks.

Preferably, the method further comprises step (g), in which, when the determination in step (d) is that the text block characteristic value does not exceed the preset threshold value, the learning unit passes the text blocks it has labeled '1' to the operation unit and the procedure proceeds to step (f).

According to the embodiment of the present invention described above, features are extracted from text blocks generated by recognizing the text in an image, and binary classification of output target text is performed by calculating the importance of the extracted features, so that only the text in the image that meets the calculated importance is selected.

According to the present invention, only the text in an image that meets the calculated importance is selected and output. When applied to a shopping mall image containing product information, only noise-free product information is provided as text, so that, in conjunction with a screen reader, only the necessary information can be conveyed by voice without the user having to look at the screen.

According to the present invention, output target text blocks are selected according to an importance derived through supervised learning, so the computation time needed to select them can be significantly reduced.

FIG. 1 is an exemplary diagram comparing text blocks extracted by conventional OCR with text blocks extracted by a system for determining the importance of text blocks extracted from an image according to an embodiment of the present invention.
FIG. 2 is a block diagram showing the detailed configuration of the system according to an embodiment of the present invention.
FIG. 3 is an exemplary diagram showing the flow of the system, from character recognition to the final result of selecting output target text blocks, according to an embodiment of the present invention.
FIG. 4 is a block diagram showing the learning unit and the output unit added to the system according to an embodiment of the present invention.
FIG. 5 is an exemplary diagram showing the operation unit of the system filtering the features contained in text blocks and the training of a neural network model, according to an embodiment of the present invention.
FIG. 6 is an exemplary diagram showing text blocks filtered by the neural network model of the learning unit being classified as '0', according to an embodiment of the present invention.
FIG. 7 is a flowchart showing a method for determining the importance of text blocks extracted from an image according to an embodiment of the present invention.
FIG. 8 is a flowchart showing the process following step S708 of the method according to an embodiment of the present invention.

The specific features and advantages of the present invention will become more apparent from the following detailed description, taken in conjunction with the accompanying drawings. Before that, the terms and words used in this specification and the claims should be interpreted with meanings and concepts consistent with the technical idea of the present invention, on the principle that an inventor may appropriately define the concepts of terms in order to describe the invention in the best way. Note also that detailed descriptions of well-known functions and configurations related to the present invention are omitted where they would unnecessarily obscure the gist of the invention.

FIG. 1 is an exemplary diagram showing text blocks extracted by conventional OCR and text blocks extracted by a system for determining the importance of text blocks extracted from an image according to an embodiment of the present invention.

As shown in FIG. 1(a), conventional OCR recognizes all of the text in an image and extracts every text region as a block. By contrast, the OCR engine according to an embodiment of the present invention, shown in FIG. 1(b), identifies text blocks of low importance (marked in red) and filters them out, selecting only text blocks of high importance as output target text blocks.

Below, with reference to FIGS. 2 to 6, the structural features by which the system (S) for determining the importance of text blocks extracted from an image according to an embodiment of the present invention selects only text blocks of high importance as output target text blocks are described.

Referring to FIG. 2, the system (S) according to an embodiment of the present invention comprises a character recognition unit 100, a character block unit 200, and an operation unit 300.

The character recognition unit 100 extracts text from an input image. The character block unit 200 generates text blocks by dividing the extracted text into sentences. The operation unit 300 extracts features from each text block to assign characteristics to that block, and classifies the block as an output target text block when its characteristic value exceeds a preset threshold value.

The features that the operation unit 300 extracts from a text block include at least one attribute value among size, width, length, character confidence, and slope. These attribute values may be replaced or supplemented by other attribute values contained in the text block.
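The feature set named above can be sketched as a small function. This is only an illustration under stated assumptions: the text does not define how each feature is computed, so the `TextBlock` fields, the use of baseline start and end points, the area-based "size", and the character-count "length" are all hypothetical choices.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class TextBlock:
    """Hypothetical container for one sentence-level text block."""
    text: str
    start: tuple[float, float]   # assumed: baseline point of the first character
    end: tuple[float, float]     # assumed: baseline point of the last character
    height: float                # assumed: glyph height in pixels
    confidences: list[float]     # assumed: per-character OCR confidence in [0, 1]


def extract_features(block: TextBlock) -> dict[str, float]:
    """Compute the five features named in the text (definitions are assumptions)."""
    dx = block.end[0] - block.start[0]
    dy = block.end[1] - block.start[1]
    width = abs(dx)
    return {
        "size": width * block.height,             # assumed: bounding area
        "width": width,
        "length": float(len(block.text)),         # assumed: character count
        "confidence": mean(block.confidences),    # assumed: mean character confidence
        "slope": abs(dy / dx) if dx else float("inf"),
    }


block = TextBlock("Hello.", (0.0, 0.0), (100.0, 5.0), 20.0, [0.9, 0.8, 1.0])
features = extract_features(block)
```

Returning a plain dictionary keeps the feature set open to the substitutions and additions that the text allows.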

Among the elements of the features, character confidence is best understood as a numerical value expressing how closely the recognized text matches known text.

Although not described in further detail below, the system (S) according to an embodiment of the present invention is embedded in any one of a PC, laptop, tablet, or smartphone capable of communicating with a server connected over an information and communication network, and is run by an application that is distributed and installed online.

The detailed configuration of the system (S) according to an embodiment of the present invention is described below.

Specifically, the character recognition unit 100 individually extracts and sequentially recognizes each piece of text in the input image, and passes the recognized text to the character block unit 200.

The character block unit 200 then sets sentence-level text blocks, each spanning from the start-point coordinates to the end-point coordinates of the text recognized by the character recognition unit 100. The end-point coordinates may be specified as the coordinates at which a period appears in the recognized text, or as the end-point coordinates of the recognized text when it is the last text of a sentence, although embodiments of the present invention are not limited to this.
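A minimal sketch of this sentence-level grouping follows. It assumes, hypothetically, that the recognizer returns tokens as `(text, start_xy, end_xy)` tuples in reading order; a block is closed at a token ending in a period or at the last token.

```python
Point = tuple[float, float]
Token = tuple[str, Point, Point]  # assumed shape: (text, start point, end point)


def _close_block(tokens: list[Token]) -> dict:
    """Merge tokens into one block spanning first start point to last end point."""
    return {
        "text": " ".join(token[0] for token in tokens),
        "start": tokens[0][1],
        "end": tokens[-1][2],
    }


def build_text_blocks(tokens: list[Token]) -> list[dict]:
    """Group recognized tokens into sentence-level text blocks."""
    blocks: list[dict] = []
    current: list[Token] = []
    for token in tokens:
        current.append(token)
        if token[0].endswith("."):       # a period marks the sentence end point
            blocks.append(_close_block(current))
            current = []
    if current:                          # trailing text without a period
        blocks.append(_close_block(current))
    return blocks


tokens = [
    ("Just", (0.0, 0.0), (10.0, 5.0)),
    ("attach.", (12.0, 0.0), (30.0, 5.0)),
    ("Done", (0.0, 10.0), (9.0, 15.0)),
]
blocks = build_text_blocks(tokens)
```

Other end-of-sentence cues (line breaks, question marks) are not covered here, in line with the text's note that the period rule is only one possible criterion.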

The operation unit 300 extracts the features contained in each text block received from the character block unit 200 and assigns characteristics to each block. When a text block's characteristic value exceeds the preset threshold value, the operation unit determines that the block is an output target text block and labels it '1'; when the characteristic value is less than or equal to the preset threshold value, it labels the block '0' and filters it out.
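The labeling rule can be sketched as below. The text does not say how several features are combined, so requiring every listed feature to exceed its threshold is an assumption of this sketch, as are the function and key names.

```python
def label_block(features: dict[str, float], thresholds: dict[str, float]) -> int:
    """Return 1 (output target) or 0 (filtered out).

    Assumption: a block is labeled 1 only if every feature listed in
    `thresholds` strictly exceeds its threshold; a value equal to the
    threshold is labeled 0, matching "less than or equal to".
    """
    passes = all(features[name] > limit for name, limit in thresholds.items())
    return 1 if passes else 0


thresholds = {"size": 500.0, "confidence": 0.75}
headline = label_block({"size": 600.0, "confidence": 0.9}, thresholds)
borderline = label_block({"size": 500.0, "confidence": 0.9}, thresholds)
```

Only the features named in `thresholds` are checked, so a caller can restrict the rule to a subset of the features.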

Here, the operation unit 300 performs labeling by binary classification, assigning '1' to "important" and '0' to "not important" for each text block in the corresponding image. The evaluation metrics of the heuristic model that serves as the basis for labeling comprise precision, recall, and accuracy.

Precision is the proportion of text blocks that are actually 1 among those the heuristic model classified as 1. Recall is the proportion of text blocks the heuristic model classified as 1 among those that are actually 1. Accuracy is the proportion of text blocks, among all text blocks, whose label the heuristic model got right.
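These three definitions translate directly into code. The sketch below uses hypothetical names and returns 0.0 where a denominator would be zero, which is a convention chosen here rather than something the text specifies.

```python
def evaluate(predicted: list[int], actual: list[int]) -> dict[str, float]:
    """Precision, recall, and accuracy for binary labels (1 = important)."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p == 1 and a == 1)  # classified 1, actually 1
    fp = sum(1 for p, a in pairs if p == 1 and a == 0)  # classified 1, actually 0
    fn = sum(1 for p, a in pairs if p == 0 and a == 1)  # classified 0, actually 1
    tn = sum(1 for p, a in pairs if p == 0 and a == 0)  # classified 0, actually 0
    return {
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
    }


scores = evaluate(predicted=[1, 1, 1, 0, 0], actual=[1, 1, 0, 1, 0])
```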

The threshold value is set based on the heuristic model's score on each evaluation metric. When test values specified by the operation unit 300 exceed an arbitrarily chosen initial threshold value, those test values are labeled '1', and the threshold is determined by verifying the precision, recall, and accuracy of the test values labeled '1'. For each verification, the initial threshold value may be increased or decreased by a preset amount, and the process is repeated as many times as the input count specifies.

Here, verification is best understood as counting how many times the test values labeled '1' fall within the range (spectrum) of previously verified values (ground-truth values), and selecting as the threshold value the initial threshold value with the highest count.
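The search over candidate thresholds might look like the following sketch. It departs from the text in one labeled way: instead of counting only the '1'-labeled test values that fall within the verified range, it counts all correctly labeled test values (an accuracy-style score), because counting hits alone would tend to favor the lowest candidate. All names and the tie-breaking rule (first best candidate wins) are assumptions.

```python
def choose_threshold(
    values: list[float],
    answers: list[int],
    initial: float,
    step: float,
    rounds: int,
) -> float:
    """Try `rounds` candidates starting at `initial`, moving by `step` each time.

    `step` may be negative to decrease the candidate. The candidate that
    labels the most test values correctly is returned.
    """
    best_threshold, best_score = initial, -1
    for i in range(rounds):
        candidate = initial + i * step
        labels = [1 if value > candidate else 0 for value in values]
        score = sum(1 for label, answer in zip(labels, answers) if label == answer)
        if score > best_score:           # keep the first candidate on ties
            best_threshold, best_score = candidate, score
    return best_threshold


# Hypothetical test values for the "size" feature with verified labels.
size_threshold = choose_threshold(
    values=[100.0, 300.0, 450.0, 550.0, 700.0],
    answers=[0, 0, 0, 1, 1],
    initial=400.0,
    step=50.0,
    rounds=5,
)
```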

In the system (S) according to an embodiment of the present invention, the threshold value that serves as the labeling criterion for each characteristic may be set, for each of the heuristic model's evaluation criteria (size, width, length, character confidence, and slope), to a value of 400 to 600, 30 to 40, 30 to 40, 0.5 to 1, and 0.02 to 0.1, respectively.

These values are preferably set to 500, 35, 35, 0.75, and 0.05, respectively, but they may be changed by the test described above and are not limited to specific numbers.
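The ranges and preferred values stated above could be held in a small configuration table, sketched here with hypothetical key names. The helper only checks that a tuned set of thresholds stays inside the stated ranges.

```python
# Ranges and preferred values as stated in the text; key names are assumptions.
THRESHOLD_RANGES: dict[str, tuple[float, float]] = {
    "size": (400.0, 600.0),
    "width": (30.0, 40.0),
    "length": (30.0, 40.0),
    "confidence": (0.5, 1.0),
    "slope": (0.02, 0.1),
}

PREFERRED_THRESHOLDS: dict[str, float] = {
    "size": 500.0,
    "width": 35.0,
    "length": 35.0,
    "confidence": 0.75,
    "slope": 0.05,
}


def within_ranges(thresholds: dict[str, float]) -> bool:
    """Check that every threshold lies inside its stated range (inclusive)."""
    return all(
        low <= thresholds[name] <= high
        for name, (low, high) in THRESHOLD_RANGES.items()
    )


preferred_ok = within_ranges(PREFERRED_THRESHOLDS)
```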

In the system (S) according to an embodiment of the present invention described above, the character recognition unit 100 receives an image (photograph) as shown in FIG. 3(a) and extracts text from the image as shown in FIG. 3(b); the character block unit 200 generates text blocks from the extracted characters as shown in FIG. 3(c); the operation unit 300 filters out text blocks of low importance to remove noise (red boxes) as shown in FIG. 3(d); and the operation unit 300 derives the output target text blocks as the final result, as shown in FIG. 3(e).

As shown in FIG. 4, in addition to classifying output target text blocks through the heuristic model of the operation unit 300, the system (S) according to an embodiment of the present invention further comprises a learning unit 400 that classifies output target text blocks using a deep-learning neural network model, and an output unit 500 that outputs each piece of text contained in the output target text blocks through a speaker as pre-recorded speech.

Specifically, in the learning unit 400, the input layer receives the characteristic values of the text blocks that the operation unit 300 labeled '0' and filtered out, and passes them to the hidden layer. Through supervised learning, the output layer assigns a value between '0' and '1' to each text block characteristic value, and the number with the larger output value is set as the label of the text block. The closer the value is to '1', the more important the text block is understood to be.
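The forward pass of such a network can be sketched in plain Python. The architecture is not specified in the text, so the single hidden layer, ReLU activation, single sigmoid output, 0.5 cutoff, and the weights shown are all assumptions for illustration; the weights here are hand-picked, not trained.

```python
import math


def sigmoid(x: float) -> float:
    """Squash a real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def importance_score(
    features: list[float],
    hidden_weights: list[list[float]],
    hidden_bias: list[float],
    output_weights: list[float],
    output_bias: float,
) -> float:
    """Input layer -> one ReLU hidden layer -> sigmoid output in (0, 1)."""
    hidden = [
        max(0.0, sum(w * f for w, f in zip(row, features)) + bias)
        for row, bias in zip(hidden_weights, hidden_bias)
    ]
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)


def relabel(score: float, cutoff: float = 0.5) -> int:
    """Assumed rule: scores at or above the cutoff are relabeled 1."""
    return 1 if score >= cutoff else 0


# Hand-picked (untrained) weights for a two-feature toy input.
weights = dict(
    hidden_weights=[[1.0, 0.0], [0.0, 1.0]],
    hidden_bias=[0.0, 0.0],
    output_weights=[2.0, -2.0],
    output_bias=0.0,
)
high = importance_score([1.0, 0.0], **weights)
low = importance_score([0.0, 1.0], **weights)
```

In practice the weights would come from supervised training on labeled text blocks, and the features would likely need normalization because their scales differ widely.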

The learning unit 400 also passes the text blocks labeled '1' through supervised learning to the operation unit 300, which then classifies them as output target text blocks.

With the learning unit 400 configured in this way, in one embodiment of the present invention the operation unit 300 extracts the features (size, width, length, character confidence, and slope) contained in each text block, as shown in FIG. 5(a).

Then, as shown in FIG. 5(b), the neural network model is trained on the text blocks other than those labeled '1', and the importance of each text block's feature values is determined accordingly so that noise can be removed. Text blocks that are not labeled '1' in FIG. 5(b) are labeled '0' and filtered out, as shown in FIG. 6.

The result of selecting output target text blocks according to the embodiment described above is as follows, with reference to the images shown in FIGS. 3, 4, and 6.

For the images shown in FIGS. 3, 4, and 6, not every text block is extracted and selected for output as in the conventional approach. Instead, only the two sentences that describe the performance of the tablet PC in the image are classified as output targets: "Just attach it, and it pairs and charges." and "In a word, snap it on and it just works."

The "messenger conversation" text is classified as an unrelated text block and excluded from output. This series of procedures is performed by the character recognition unit 100, the character block unit 200, and the operation unit 300, and the filtering function of the additionally provided learning unit 400 can also improve the speed at which the operation unit 300 classifies output target text blocks.

A method for determining the importance of text blocks extracted from an image according to an embodiment of the present invention is described below with reference to FIG. 7.

First, the character recognition unit 100 extracts text from the input image (S702).

Next, the character block unit 200 generates text blocks by dividing the extracted text into sentences (S704).

The operation unit 300 then extracts features from each text block and assigns characteristics to that block (S706).

Next, the operation unit 300 determines whether the text block characteristic value exceeds the preset threshold value (S708).

If the determination in step S708 is that the characteristic value exceeds the preset threshold value, the operation unit 300 determines that the block is an output target text block and labels it '1' (S710).

The operation unit 300 then outputs the text blocks labeled '1' and removes the remaining text blocks (S712).

The process following step S708 of the method according to an embodiment of the present invention is described below with reference to FIG. 8.

If the determination in step S708 is that the text block characteristic value does not exceed the preset threshold value, the learning unit 400 labels the text block '1' according to the importance derived through supervised learning (S802).

The learning unit 400 then passes the text blocks labeled '1' to the operation unit 300, and the procedure proceeds to step S712 (S804).
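Steps S708 to S712, together with the fallback in S802 and S804, can be sketched as one selection function. The all-features threshold rule, the 0.5 cutoff, and the `rescore` callable standing in for the learning unit are assumptions of this sketch, not details given in the text.

```python
from typing import Callable

Features = dict[str, float]


def select_output_blocks(
    blocks: list[tuple[str, Features]],
    thresholds: Features,
    rescore: Callable[[Features], float],
    cutoff: float = 0.5,
) -> list[str]:
    """Return the text of the blocks chosen for output, in input order."""
    selected: list[str] = []
    for text, features in blocks:
        # S708/S710: heuristic check by the operation unit.
        passes = all(features[name] > limit for name, limit in thresholds.items())
        # S802/S804: blocks that fail are rescored by the learning unit.
        if passes or rescore(features) >= cutoff:
            selected.append(text)        # S712: output blocks labeled 1
    return selected                      # all other blocks are dropped


# Hypothetical stand-in for a trained model.
def toy_rescore(features: Features) -> float:
    return 0.9 if features["size"] > 400.0 else 0.1


chosen = select_output_blocks(
    blocks=[
        ("big", {"size": 600.0}),
        ("rescued", {"size": 450.0}),
        ("noise", {"size": 10.0}),
    ],
    thresholds={"size": 500.0},
    rescore=toy_rescore,
)
```

The learning unit is called only for blocks that fail the heuristic check, which mirrors the order of the flowcharts in FIGS. 7 and 8.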

Thus, according to the embodiment of the present invention described above, features are extracted from text blocks generated by recognizing the text in an image, and binary classification of output target text is performed by calculating the importance of the extracted features. As a result, only the text in the image that meets the calculated importance can be selected and output, and, in conjunction with a screen reader, only the information that visually impaired users need when shopping online can be provided by voice.

Although the present invention has been described and illustrated above in relation to a preferred embodiment for illustrating its technical idea, the present invention is not limited to the configuration and operation as shown and described, and those skilled in the art will appreciate that many changes and modifications may be made to the present invention without departing from the scope of the technical idea. Accordingly, all such suitable changes, modifications, and equivalents are to be regarded as falling within the scope of the present invention.

S: system for determining the importance of a text block extracted from an image
100: character recognition unit
200: character block unit
300: operation unit
400: learning unit
500: output unit

Claims

A system for determining the importance of a text block extracted from an image, the system comprising:
a character recognition unit that extracts text from an input image;
a character block unit that generates text blocks by dividing the extracted text into sentence units, each text block including the start point coordinates and the end point coordinates of the text recognized by the character recognition unit;
an operation unit that extracts features from each text block using the text recognition result from the character recognition unit and the start point coordinates and end point coordinates from the character block unit, digitizes each extracted feature as a text block characteristic value for the corresponding text block, and classifies the text block as an output target text block when the text block characteristic value exceeds a preset threshold value; and
a learning unit in which an input layer receives the text block characteristic values that the operation unit left unclassified because they are smaller than the preset threshold value and applies them to a hidden layer, and an output layer assigns each text block characteristic value a value between '0' and '1' according to the importance derived through supervised learning, the learning unit applying text blocks with a large output value to the operation unit so that they are classified as output target text blocks,
wherein the operation unit outputs the text blocks labeled '1' and removes the other text blocks, thereby selecting only the text that matches the importance from among the text included in the image.

The system according to claim 1,
wherein the feature includes at least one attribute value among the size, width, length, character reliability, and inclination of the text block.

The system according to claim 1,
wherein the operation unit extracts the features included in the text block applied from the character block unit and designates the characteristics of each text block, and
when the text block characteristic value exceeds the preset threshold value, determines that the text block is an output target text block and labels it '1', and when the text block characteristic value is less than or equal to the preset threshold value, filters the text block by labeling it '0'.

delete

A method for determining the importance of a text block extracted from an image, the method comprising:
(a) extracting, by a character recognition unit, text from an input image;
(b) generating, by a character block unit, text blocks by dividing the extracted text into sentence units, each text block including the start point coordinates and the end point coordinates of the text recognized by the character recognition unit;
(c) extracting, by an operation unit, features from each text block using the text recognition result from the character recognition unit and the start point coordinates and end point coordinates from the character block unit, and digitizing each extracted feature as a text block characteristic value for the corresponding text block;
(d) determining, by the operation unit, whether the text block characteristic value exceeds a preset threshold value;
(e) when it is determined in step (d) that the text block characteristic value exceeds the preset threshold value, determining, by the operation unit, that the text block is an output target text block and labeling the text block as '1'; and
(f) outputting, by the operation unit, the text blocks labeled '1' and removing the other text blocks, thereby selecting only the text that matches the importance from among the text included in the image,
wherein, as a result of the determination in step (d), an input layer receives the text block characteristic values that the operation unit left unclassified because they are smaller than the preset threshold value and applies them to a hidden layer, an output layer assigns each text block characteristic value a value between '0' and '1' according to the importance derived through supervised learning, and text blocks with a large output value are applied to the operation unit so that they are classified as output target text blocks.

delete