KR20230083971A

KR20230083971A - A method for translating and editing text contained within an image and a device for performing the same

Info

Publication number: KR20230083971A
Application number: KR1020220038883A
Authority: KR
Inventors: 김진구
Original assignee: 주식회사 오후랩스
Priority date: 2021-12-03
Filing date: 2022-03-29
Publication date: 2023-06-12
Also published as: WO2023101114A1

Abstract

According to one embodiment of the present application, provided is a method for translating target text included in an image, which includes the steps of: obtaining an image, wherein the image is an image including target text; extracting a target text area, which is an area corresponding to at least a portion of the target text, from the obtained image; obtaining translated text by translating the target text included in the target text area using a predetermined algorithm; processing the target text area according to predetermined standards; and inserting and placing the obtained translated text into the target text area.

Description

A method for translating and editing text included in an image and a device for performing the same

The present invention relates to a method for translating text included in an image and editing the translated text, and to an apparatus for performing the same.

Recently, the number of users of overseas purchasing agency services has increased rapidly, and with it the number of online sellers acting as overseas purchasing agents. These sellers search for overseas products, collect information about them, reprocess that information (for example, by translating it), and then register the products on a marketplace for sale. However, translating information about overseas products has several limitations.

Web browsers support translation of text by default, but translation services for text embedded in images are not widely available. As a result, translating an image takes longer and costs more than translating plain text.

Accordingly, there is a demand for an image translation solution that can automatically translate text included in an image and allow the translated text to be edited appropriately.

One object of the present invention is to provide a method for automatically translating text included in an image and editing the translated text, and an apparatus for performing the same.

The problems to be solved by the present invention are not limited to the one described above; problems not mentioned here will be clearly understood by those of ordinary skill in the art to which the present invention pertains from this specification and the accompanying drawings.

A method for translating target text included in an image, as disclosed in this application, may include: obtaining an image, wherein the image includes target text; extracting, from the obtained image, a target text area, which is an area corresponding to at least a portion of the target text; obtaining translated text by translating the target text included in the target text area using a predetermined algorithm; processing the target text area according to a predetermined criterion; and inserting and arranging the obtained translated text in the target text area.

The means for solving the problems of the present invention are not limited to those described above; means not mentioned here will be clearly understood by those of ordinary skill in the art to which the present invention pertains from this specification and the accompanying drawings.

According to embodiments of the present application, text included in an image is translated in real time and reflected in the image, and a service for editing the translated text is provided, so that images can be translated and edited efficiently and conveniently.

The effects of the present invention are not limited to those described above; effects not mentioned here will be clearly understood by those of ordinary skill in the art to which the present invention pertains from this specification and the accompanying drawings.

FIG. 1 is a diagram illustrating a system for translating and editing text included in an image according to an embodiment.
FIG. 2 is a diagram illustrating the configuration of a server according to an embodiment.
FIG. 3 is a diagram illustrating an image translation and editing method according to an embodiment.
FIG. 4 is a diagram illustrating, by way of example, how the server extracts target text from an image.
FIG. 5 is a diagram illustrating, by way of example, a method by which the server extracts standardized target text and unstructured target text from an image.
FIG. 6 is a diagram illustrating, by way of example, a method by which the server obtains translated text according to an embodiment.
FIG. 7 is a diagram illustrating, by way of example, a method by which the server obtains translated text according to another embodiment.
FIG. 8 is a diagram illustrating, by way of example, a method of analyzing product information from an image and using it to translate text.
FIGS. 9 and 10 are diagrams illustrating, by way of example, a method by which the server selectively translates text included in an image.
FIGS. 11 and 12 are diagrams illustrating, by way of example, a specific method by which the server translates target text.
FIG. 13 is a diagram illustrating, by way of example, a method of inserting the obtained translated text into an image.
FIGS. 14 and 15 are diagrams illustrating, by way of example, a method of inserting into an image translated text that reflects the feature information (style information) of the target text.
FIGS. 16 and 17 are diagrams illustrating a method of obtaining feature information of target text according to an embodiment.
FIG. 18 is a diagram illustrating, by way of example, a method by which the server additionally modifies the translated text before inserting it.
FIG. 19 is a diagram illustrating a method by which the server additionally modifies the translated text when the translated text would be inserted outside the area of the image.
FIGS. 20 and 21 are diagrams illustrating, by way of example, the UX/UI of software that allows the inserted translated text to be further edited and the edited image to be stored and managed.
FIG. 22 is a diagram illustrating, by way of example, a method of extracting target text from an image using a character recognition algorithm.
FIG. 23 is a diagram illustrating, by way of example, a method of defining text included in an image as one or more groups.
FIG. 24 is a diagram illustrating, by way of example, a method of obtaining Symbol information and text area information from an image using an algorithm.
FIG. 25 is a diagram illustrating, by way of example, a method of classifying and determining text included in an image as Word groups according to an embodiment.
FIG. 26 is a diagram illustrating, by way of example, a method of determining a plurality of Symbols included in an image as one or more Word groups.
FIG. 27 is a diagram illustrating, by way of example, a method of classifying and determining text included in an image as Line groups according to an embodiment.
FIGS. 28 to 31 are diagrams illustrating, by way of example, conditions for determining a plurality of Word groups as a Line group.
FIG. 32 is a diagram illustrating, by way of example, a method of classifying and determining text included in an image as Paragraph groups according to an embodiment.
FIG. 33 is a diagram illustrating, by way of example, conditions for determining a plurality of Line groups as a Paragraph group.

The above objects, features, and advantages of the present application will become more apparent from the following detailed description taken in conjunction with the accompanying drawings. However, the present application may be modified in various ways and may have various embodiments; specific embodiments are illustrated in the drawings and described in detail below.

Throughout the specification, the same reference numerals denote, in principle, the same elements. In addition, elements that have the same function within the scope of the same idea and appear in the drawings of each embodiment are described using the same reference numerals, and redundant descriptions of them are omitted.

Where a detailed description of a known function or configuration related to the present application is judged likely to obscure the subject matter of the present application unnecessarily, that description is omitted. In addition, ordinal numbers used in this specification (for example, first, second, and so on) are merely identifiers for distinguishing one element from another.

In addition, the suffixes "module" and "unit" used for elements in the following embodiments are assigned or used interchangeably only for ease of drafting the specification, and do not by themselves have distinct meanings or roles.

In the following embodiments, singular expressions include plural expressions unless the context clearly indicates otherwise.

In the following embodiments, terms such as "include" or "have" mean that the features or elements described in the specification are present, and do not preclude the possibility that one or more other features or elements may be added.

In the drawings, the size of elements may be exaggerated or reduced for convenience of explanation. For example, the size and thickness of each element shown in the drawings are chosen arbitrarily for convenience of explanation, and the present invention is not necessarily limited to what is shown.

Where an embodiment can be implemented differently, a particular sequence of processes may be performed in an order different from the one described. For example, two processes described in succession may be performed substantially simultaneously, or in the reverse of the order described.

In the following embodiments, when elements are said to be connected, this includes not only the case where they are directly connected but also the case where they are indirectly connected with other elements interposed between them.

For example, when elements are said to be electrically connected in this specification, this includes not only the case where they are directly electrically connected but also the case where they are indirectly electrically connected with other elements interposed between them.

A method of translating target text included in an image according to an embodiment may include: obtaining an image, wherein the image includes target text; extracting, from the obtained image, a target text area, which is an area corresponding to at least a portion of the target text; obtaining translated text by translating the target text included in the target text area using a predetermined algorithm; processing the target text area according to a predetermined criterion; and inserting and arranging the obtained translated text in the target text area.
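
The five steps of this embodiment can be sketched as a simple pipeline. The following Python sketch is illustrative only: `TextArea`, `translate_image_text`, and the four callables it receives are hypothetical names introduced here, the box representation is an assumption, and the specification does not prescribe any particular detection, translation, or rendering algorithm.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple

# Assumed representation of a target text area: (x, y, width, height).
Box = Tuple[int, int, int, int]


@dataclass
class TextArea:
    box: Box   # where the target text sits in the image
    text: str  # the recognized target text


def translate_image_text(
    image: Any,
    extract_areas: Callable[[Any], List[TextArea]],
    translate: Callable[[str], str],
    process_area: Callable[[Any, Box], Any],
    insert_text: Callable[[Any, Box, str], Any],
) -> Any:
    """Run the steps in the order described: extract, translate, process, insert."""
    for area in extract_areas(image):          # extract target text areas
        translated = translate(area.text)      # obtain translated text
        # Process the area by some predetermined criterion (assumed here to
        # mean preparing the region, for example clearing the original text).
        image = process_area(image, area.box)
        image = insert_text(image, area.box, translated)  # place the translation
    return image
```

Passing the steps in as callables keeps the sketch independent of any specific OCR engine, translation service, or image library.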

The image may include a plurality of target texts, and the plurality of target texts may be classified into at least one target text group according to a predetermined criterion. In this case, obtaining the translated text may include translating the target text for each of the at least one target text group using the predetermined algorithm.

The plurality of target texts may include a first target text group and a second target text group classified based on any one of the font type, font color, size, and language type of the target text.

Obtaining the translated text may include translating the target text included in the first target text group using a first algorithm, and translating the target text included in the second target text group using a second algorithm.

In obtaining the translated text, whichever of the first target text group and the second target text group satisfies a predetermined condition may be selectively translated.
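
The grouping and per-group translation described above can be sketched as follows. This is a minimal sketch under stated assumptions: `TargetText`, `group_texts`, and `translate_groups` are hypothetical names, grouping by exact equality of a single attribute is only one possible reading of the "predetermined criterion", and the per-group translators stand in for the unspecified first and second algorithms.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class TargetText:
    text: str
    font: str
    color: str
    size: int
    language: str


def group_texts(texts: Iterable[TargetText], attribute: str) -> Dict[object, List[TargetText]]:
    """Classify target texts into groups that share one attribute value."""
    groups = defaultdict(list)
    for item in texts:
        groups[getattr(item, attribute)].append(item)
    return dict(groups)


def translate_groups(
    groups: Dict[object, List[TargetText]],
    translators: Dict[object, Callable[[str], str]],
    condition: Callable[[object], bool] = lambda key: True,
) -> Dict[object, List[str]]:
    """Translate each group with its own algorithm, skipping groups that
    fail the condition or have no translator assigned."""
    results = {}
    for key, members in groups.items():
        if condition(key) and key in translators:
            results[key] = [translators[key](m.text) for m in members]
    return results
```

For example, grouping by `"language"` and passing a condition that accepts only one language code would translate that group and leave the other untouched, which corresponds to the selective translation case.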

The plurality of target texts may include a first type of target text and a second type of target text, where the first type of target text is standardized target text among the plurality of target texts and the second type of target text is unstructured target text among the plurality of target texts. In this case, obtaining the translated text may include translating the first type of target text using a first algorithm, and translating the second type of target text using a second algorithm.

The first type of target text may be target text located in an area of the image corresponding to the background, and the second type of target text may be target text located in an area of the image corresponding to a product photo.
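
One way to picture this distinction is a rectangle-overlap check between a text area and known product-photo areas. The sketch below is an assumption-laden illustration: the `(x, y, width, height)` box format, the function names, and the rule "any overlap with a product photo means second type" are all hypothetical and are not stated in the specification.

```python
from typing import Iterable, Tuple

# Assumed box format: (x, y, width, height) in pixels.
Box = Tuple[int, int, int, int]


def overlaps(a: Box, b: Box) -> bool:
    """Return True if two axis-aligned rectangles share any interior area."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


def classify_text(text_box: Box, product_boxes: Iterable[Box]) -> str:
    """Label text over a product photo as 'second' type, otherwise 'first' type."""
    if any(overlaps(text_box, p) for p in product_boxes):
        return "second"
    return "first"
```

The resulting label could then be used to route each text to the first or second algorithm.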

Obtaining the translated text may include: obtaining a first candidate translated text and a second candidate translated text by translating the target text included in the target text area using a predetermined algorithm; and determining one of the first candidate translated text and the second candidate translated text as the translated text according to a predetermined criterion.

The method of translating target text included in the image may include extracting metadata from the image, and obtaining the translated text may include translating the target text included in the target text area based on the extracted metadata.

The method of translating target text included in the image may include extracting metadata from the image, and obtaining the translated text may include determining one of the first candidate translated text and the second candidate translated text as the translated text based on the metadata.

The method of translating target text included in the image may include obtaining user information, wherein the user information includes information about the country to which the user belongs or the user's language environment, and obtaining the translated text may include translating the target text included in the target text area based on the obtained user information.

The method of translating target text included in the image may include obtaining user information, wherein the user information includes information about the country to which the user belongs or the user's language environment, and in obtaining the translated text, one of the first target text group and the second target text group may be selectively translated based on the obtained user information.

The method of translating target text included in the image may include obtaining user information, wherein the user information includes information about the country to which the user belongs or the user's language environment, and obtaining the translated text may include determining one of the first candidate translated text and the second candidate translated text as the translated text based on the obtained user information.

The method of translating target text included in the image may include obtaining product information from the image, wherein the product information is information about a product included in the image, and obtaining the translated text may include translating the target text included in the target text area based on the obtained product information.

The method of translating target text included in the image may include obtaining product information from the image, wherein the product information is information about a product included in the image, and obtaining the translated text may include determining one of the first candidate translated text and the second candidate translated text as the translated text based on the obtained product information.
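
Choosing between candidate translations using context such as metadata, user information, or product information can be sketched as a scoring step. The specification does not state the selection criterion; the word-overlap score below is purely an assumption chosen for brevity, and `choose_translation` is a hypothetical name.

```python
from typing import Iterable, Sequence


def choose_translation(candidates: Sequence[str], context_terms: Iterable[str]) -> str:
    """Pick the candidate sharing the most words with the context terms.

    Ties resolve to the earliest candidate, because max() returns the
    first maximal item it encounters.
    """
    context = {term.lower() for term in context_terms}

    def score(candidate: str) -> int:
        # Assumed criterion: count distinct candidate words found in the context.
        return len(set(candidate.lower().split()) & context)

    return max(candidates, key=score)
```

Here `context_terms` might be keywords derived from the product information (for instance, a product category), so that an ambiguous source word is resolved toward the sense that matches the product.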

Obtaining the translated text may include: obtaining a candidate translated text by translating the target text included in the target text area using a predetermined algorithm; obtaining, according to a predetermined criterion, a recommended translated text that is a synonym or near-synonym of the candidate translated text; and determining one of the candidate translated text and the recommended translated text as the translated text.

The recommended translated text may be a synonym or near-synonym of the candidate translated text that is determined based on at least one of metadata obtained from the image, product information obtained from the image, and user information.

The method of translating target text included in the image may include obtaining style information about the target text, and inserting and arranging the translated text may include: determining the style of the translated text based on the obtained style information; and inserting and arranging the translated text in the target text area with the determined style applied.

The style information may include at least one of the font type, text color, text size, text layout direction, text alignment, text border color, letter spacing, and text background color of the target text.

Obtaining the style information about the target text may further include analyzing the edges of the target text or analyzing the feature points of the target text, and the style information may be information obtained through the edge analysis or the feature point analysis of the target text.

Determining the style of the translated text may include: obtaining translated text to which the obtained style information has been applied; determining a translated text area, which is the area of the image in which the translated text is placed, and deciding, based on the translated text area, whether to further modify the style of the translated text; and, if it is determined that the style of the translated text needs to be further modified, further modifying the style of the translated text in consideration of the translated text area and then inserting and arranging the translated text in the image.

In deciding whether to further modify the style of the translated text, it may be determined that the style must be further modified when at least part of the translated text area falls outside the area of the image.

In further modifying the style of the translated text and then inserting and arranging it in the image, at least one of the font type, text color, text size, text layout direction, text alignment, text border color, letter spacing, and text background color of the translated text may be further modified before the translated text is inserted and arranged in the image.

Determining the style of the translated text may include, when the translated text to which the obtained style information has been applied would be inserted outside the area of the image, further modifying the style of the translated text and inserting and arranging it in the image.
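
The overflow handling described above can be illustrated with a sketch that shrinks the text size until the translated text fits horizontally inside the image. This is one possible modification among those listed (text size only); `Style`, `fit_style`, the `measure_width` callable, and the `min_size` floor are hypothetical and introduced solely for illustration.

```python
from dataclasses import dataclass, replace
from typing import Callable


@dataclass(frozen=True)
class Style:
    font: str
    size: int
    color: str


def fit_style(
    text: str,
    style: Style,
    origin_x: int,
    image_width: int,
    measure_width: Callable[[str, Style], int],
    min_size: int = 6,
) -> Style:
    """Reduce the text size one step at a time while the translated text
    would extend past the right edge of the image."""
    while style.size > min_size and origin_x + measure_width(text, style) > image_width:
        style = replace(style, size=style.size - 1)
    return style
```

The width measurement is passed in as a callable because real text width depends on the font and rendering library in use. If the text still overflows at `min_size`, the sketch returns the minimum size unchanged; a fuller implementation might then adjust letter spacing or wrap the text instead.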

1. Overall system

FIG. 1 is a diagram illustrating a system for translating and editing text included in an image according to an embodiment, and FIG. 2 is a diagram illustrating the configuration of a server according to an embodiment.

Referring to FIG. 1, the image translation and editing method according to an embodiment may be performed by a server 1000. Referring to FIG. 2, the server 1000 may include a control unit 100, a storage unit 200, and a communication unit 300.

The storage unit 200 may store neural network models, algorithms, and/or other data used to perform image translation and editing. The neural network models, algorithms, and/or other data stored in the storage unit 200 may have been received from an external server or electronic device.

The communication unit 300 may communicate with a personal cloud device, an external server, and/or an external device. The communication unit 300 may receive data entered by a user from an external device, server, or the like, and may transmit data stored on the server 1000 to an external device, server, or the like. To this end, the communication unit 300 may include one or more components that enable communication with external devices, servers, and the like, for example at least one of a short-range communication module, a wired communication module, and a communication module.

The short-range communication module may include various modules that transmit and receive signals over a wireless network at short range, such as a Bluetooth module, an infrared communication module, an RFID (Radio Frequency Identification) communication module, a WLAN (Wireless Local Area Network) communication module, an NFC communication module, and a Zigbee communication module.

The wired communication module may include not only various wired communication modules such as a Controller Area Network (CAN) communication module, a Local Area Network (LAN) module, a Wide Area Network (WAN) module, or a Value Added Network (VAN) module, but also various cable communication modules such as USB (Universal Serial Bus), HDMI (High Definition Multimedia Interface), DVI (Digital Visual Interface), RS-232 (Recommended Standard 232), power line communication, or POTS (plain old telephone service).

제어부(100)는 적어도 하나의 프로세서를 포함할 수 있다. 이때, 각각의 프로세서는 메모리에 저장된 적어도 하나의 명령어를 실행시킴으로써, 소정의 동작을 실행할 수 있다. 구체적으로 제어부(100)는 서버(1000)에 포함되어 있는 구성들의 전체적인 동작을 제어할 수 있다. 다시 말해, 서버(1000)는 제어부(100)에 의해 제어 또는 동작할 수 있다.The controller 100 may include at least one processor. At this time, each processor may execute a predetermined operation by executing at least one instruction stored in the memory. Specifically, the control unit 100 may control overall operations of elements included in the server 1000 . In other words, the server 1000 may be controlled or operated by the controller 100 .

일 실시예에 따르면, 제어부(100)는 이미지를 획득하고, 획득된 이미지 내에 포함된 텍스트를 번역할 수 있고, 번역된 텍스트를 사용자에게 제공할 수 있다. 또한, 제어부(100)는 상기 번역된 텍스트가 사용자에 의해 편집될 수 있도록 하는 기능을 수행할 수 있다.According to an embodiment, the controller 100 may acquire an image, translate text included in the acquired image, and provide the translated text to a user. Also, the controller 100 may perform a function of allowing the translated text to be edited by the user.

The series of steps by which the controller 100 performs image translation and editing is described in detail below.

Referring back to FIG. 1, an image translation and editing system according to an embodiment may include a server 1000 and an electronic device 2000. The electronic device 2000 may include a portable information and communication device such as a smartphone or a tablet. The electronic device 2000 may also include various devices that automatically perform calculations or data processing using electronic circuits.

According to an embodiment, the server 1000 may translate and edit text included in an acquired image. In this case, the server 1000 may perform the image translation and editing operations based on data received from the electronic device 2000. For example, the electronic device 2000 may acquire an image through a user's input and transmit the acquired image to the server 1000, and the server 1000 may translate and edit the text included in the received image.

According to another embodiment, the electronic device 2000 may translate and edit text included in an acquired image. The electronic device 2000 may perform text translation and editing on an image acquired through a user's input, and then transmit the translated and edited image to the server 1000. The server 1000 may store the image received from the electronic device 2000 in the storage unit 200 and then transmit it to another external device.

Hereinafter, the method of translating and editing text included in an image is described, by way of example, as being performed by the server 1000. The method is not limited thereto, however, and may also be performed by the electronic device 2000 as described above.

2 Overall process

FIG. 3 is a diagram illustrating an image translation and editing method according to an embodiment. Referring to FIG. 3, the method may include acquiring an image (S1000), extracting target text from the image (S2000), translating the target text (S3000), obtaining translated text into which the target text has been translated (S4000), removing the target text and background from the image (S5000), compositing the background (S6000), inserting the translated text (S7000), and editing the translated text (S8000).
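The step sequence of FIG. 3 can be pictured as a simple ordered pipeline. The following is only a minimal sketch: the step labels come from the description, while `run_pipeline`, the `state` dictionary, and the step callables are hypothetical names introduced for illustration and are not part of the disclosed method.

```python
from typing import Callable, Dict, List

# Step labels follow FIG. 3; everything else here is an illustrative assumption.
STEP_ORDER: List[str] = [
    "S1000", "S2000", "S3000", "S4000", "S5000", "S6000", "S7000", "S8000",
]


def run_pipeline(state: dict, steps: Dict[str, Callable[[dict], dict]]) -> dict:
    """Apply each registered step in FIG. 3 order, threading a state dict through."""
    for label in STEP_ORDER:
        step = steps.get(label)
        if step is None:
            continue  # steps without a registered handler are skipped
        state = step(state)
        state.setdefault("trace", []).append(label)  # record which steps ran
    return state
```

A caller would register one callable per step label (for example, an OCR wrapper under "S2000"); the order of registration does not matter because the order is fixed by `STEP_ORDER`.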

3 Image acquisition

The server 1000 may acquire an image through the image acquisition step (S1000). The server 1000 may acquire the image from a user's input. The image may include at least one text. The image may be an image containing information about a product, but is not limited thereto and may be any of various kinds of images uploaded to the web.

4 Target text extraction

The server 1000 may extract target text from the acquired image through the target text extraction step (S2000). For example, the target text may be text included in the image. The target text may also refer to the region of the image that corresponds to the text.

FIG. 4 is a diagram illustrating, by way of example, how the server extracts target text from an image. Referring to FIG. 4, the image may include at least one text, and the server 1000 may extract at least one target text from the image.

The server 1000 may recognize the target text included in the image using a character recognition algorithm (or optical character recognition (OCR)), and then determine the corresponding region of the image as a target text area.

The acquired image may contain at least one region corresponding to text, and the server 1000 may extract at least one of the text areas present in the image.

For example, as shown in FIG. 4, a plurality of target texts may be present in the image, and the server 1000 may extract the plurality of target texts from the image. In this case, the extracted target texts may be a first target text T1, a second target text T2, a third target text T3, a fourth target text T4, a fifth target text T5, a sixth target text T6, a seventh target text T7, an eighth target text T8, and a ninth target text T9.

The plurality of texts may be in a single language or in a plurality of languages. For example, as in FIG. 4, the image may include target text written in Chinese (the second to ninth target texts) and target text written in English (the first target text).

According to an embodiment, the server 1000 may extract at least one target text area from the image according to a predetermined criterion.

For example, the server 1000 may extract target text areas by classifying the plurality of texts recognized by the character recognition algorithm by font. More specifically, the server 1000 may determine text having a first font in the image as a first target text area, and text having a second font as a second target text area.

As another example, the server 1000 may extract target text areas by classifying the plurality of texts recognized by the character recognition algorithm by character size. More specifically, the server 1000 may determine text having a first size in the image as a first target text area, and text having a second size as a second target text area.
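Classification by font or by character size amounts to bucketing recognized texts on one attribute. The sketch below assumes the recognizer already reports a font name and a pixel size for each text; the `Text` record and `group_by` helper are hypothetical and only illustrate the grouping idea.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Text:
    content: str
    font: str   # font name as reported by a recognizer (assumed available)
    size: int   # character height in pixels (assumed unit)


def group_by(texts: List[Text], key: str) -> Dict[object, List[str]]:
    """Bucket recognized texts by one attribute, e.g. 'font' or 'size'."""
    groups: Dict[object, List[str]] = defaultdict(list)
    for t in texts:
        groups[getattr(t, key)].append(t.content)
    return dict(groups)
```

Each resulting bucket would correspond to one target text area (first font or first size, second font or second size, and so on).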

As another example, the server 1000 may extract target text areas based on the positions within the image of the plurality of texts recognized by the character recognition algorithm. More specifically, the server 1000 may determine texts that lie within a predetermined distance of one another in the image as a single target text area. For example, text located in a first region of the image may be determined as a first target text area, and text located in a second region as a second target text area.
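One way to realize position-based grouping is to merge bounding boxes whose gap is below a threshold. The following sketch uses a small union-find over axis-aligned boxes; the box format, the `gap` measure, and the `max_gap` threshold are assumptions chosen for illustration, since the description does not specify how adjacency is measured.

```python
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def gap(a: Box, b: Box) -> int:
    """Largest axis-wise gap between two boxes; 0 if they touch or overlap."""
    dx = max(a[0] - b[2], b[0] - a[2], 0)
    dy = max(a[1] - b[3], b[1] - a[3], 0)
    return max(dx, dy)


def group_nearby(boxes: List[Box], max_gap: int) -> List[List[int]]:
    """Return groups of box indices whose members are chained within max_gap."""
    parent = list(range(len(boxes)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if gap(boxes[i], boxes[j]) <= max_gap:
                parent[find(j)] = find(i)  # merge the two groups

    groups: Dict[int, List[int]] = {}
    for i in range(len(boxes)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

With two text lines stacked 5 pixels apart and a third far below, a `max_gap` of 10 would place the first two in one target text area and the third in another.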

FIG. 5 is a diagram illustrating, by way of example, a method by which the server extracts structured target text and unstructured target text from an image. Referring to FIG. 5, the target text extraction step (S2000) may include a structured target text extraction step (S2100) and an unstructured target text extraction step (S2200).

The image acquired by the server 1000 may include a plurality of texts. These texts may include structured text located in the background region of the image and unstructured text located on, for example, a product photograph within the image.

Referring to the image illustrated in FIG. 4, the target text areas extracted from the image may include structured target text areas, for example, a first target text area T1, a second target text area T2, a third target text area T3, a fourth target text area T4, a fifth target text area T5, and a ninth target text area T9. The target text areas may also include unstructured target text areas, for example, a sixth target text area T6, a seventh target text area T7, and an eighth target text area T8.

A more detailed description of extracting target text using the character recognition algorithm is given below with reference to the drawings.

5 Target text translation (obtaining translated text)

5.1 Neural network model

The server 1000 may extract structured target text and unstructured target text from the acquired image, and then obtain translated text from each extracted target text.

FIG. 6 is a diagram illustrating, by way of example, a method by which the server obtains translated text according to an embodiment. Referring to FIG. 6, the server 1000 may obtain, through the translated text obtaining step (S4000), translated text into which the target text has been translated. For example, the server 1000 may acquire an image, extract target text from it, and then translate the extracted target text to obtain the translated text.

According to an embodiment, the server 1000 may obtain translated text by translating the extracted target text using a pretrained neural network model (NN). That is, the server 1000 may translate the target text using deep learning-based machine translation. The method is not limited thereto, however; although not shown in the drawings, the server 1000 may obtain translated text by translating the target text using any of various known translation algorithms or devices.

Meanwhile, as shown in FIG. 6, the server 1000 may additionally obtain a user input in order to obtain the translated text. For example, the server 1000 may extract target text from the image, translate the target text using the neural network model to obtain at least one candidate translated text, and then determine at least one of the candidate translated texts as the translated text based on the user input.

5.2 Improving translation quality: using metadata

FIG. 7 is a diagram illustrating, by way of example, a method by which the server obtains translated text according to another embodiment. Hereinafter, a method of using metadata to improve the quality of translation performed by the server 1000 is described with reference to FIG. 7.

Referring to (a) of FIG. 7, the server 1000 may extract metadata from the image through a metadata extraction step (S2100).

The server 1000 may extract the metadata by analyzing the information contained in the acquired image. For example, the server 1000 may analyze the acquired image to obtain information about objects, people, and/or spaces included in the image.

The server 1000 may use the extracted metadata when translating the target text using the neural network model (NN). The server 1000 may translate the extracted target text using the neural network model (NN) or the like to obtain a plurality of candidate translated texts. The server 1000 may then determine, based on the metadata, at least one of the obtained candidate translated texts as the translated text.

For example, when a word and/or sentence to be translated is translated using the neural network model (NN) or the like, the word and/or sentence may be translated into one or more versions. In this case, the version that corresponds to the extracted metadata may be determined as the final translation.

As a more specific example, the server 1000 may extract target text from the acquired image and then translate the target text using the neural network model (NN) or the like to obtain a plurality of candidate translated texts, for example, a first candidate translated text and a second candidate translated text. The server 1000 may then determine, as the final translated text, whichever of the first and second candidate translated texts corresponds most closely to (or is most relevant to) the metadata extracted from the image.
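Choosing the candidate that best matches the metadata can be sketched as a scoring problem. The description does not say how correspondence is measured, so the example below assumes a deliberately simple measure (the number of metadata tags that appear as words in a candidate); `pick_by_metadata` and the tag list are hypothetical.

```python
from typing import Iterable, List


def pick_by_metadata(candidates: List[str], metadata_tags: Iterable[str]) -> str:
    """Return the candidate sharing the most words with the metadata tags.

    Word overlap is an assumed stand-in for the 'most relevant' criterion.
    Ties are resolved in favor of the earlier candidate.
    """
    tags = {t.lower() for t in metadata_tags}

    def score(text: str) -> int:
        return len(tags & set(text.lower().split()))

    return max(candidates, key=score)
```

In practice a learned similarity (for example, embedding distance between the candidate and the detected object labels) could replace the word-overlap score without changing the selection logic.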

5.3 Improving translation quality: using user information

Referring to (b) of FIG. 7, the server 1000 may additionally obtain user information through a user information obtaining step (S2300) and then use it to obtain the translated text.

By way of example, the server 1000 may obtain, through the user information obtaining step (S2300), the country to which the user belongs and/or information about the user's language environment. The server 1000 may then translate the target text included in the image in consideration of the obtained country and/or language environment information.

As a more specific example, the server 1000 may extract target text written in a first language and/or a second language from the image, and then translate the target text into a third language determined based on the user information (for example, the user's language environment information).

That is, without any separate external input or selection, the server 1000 may automatically translate the acquired image into the language used in the country to which the user belongs, taking into account the user's language environment and the like.
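Determining the target language from user information can be sketched as a lookup with a fallback. The field names (`language`, `country`), the country-to-language table, and the default are all assumptions for illustration; the description only states that the country and/or language environment is taken into account.

```python
from typing import Dict, Optional

# Illustrative mapping only; a real system would use a fuller locale table.
COUNTRY_TO_LANGUAGE: Dict[str, str] = {"KR": "ko", "US": "en", "JP": "ja", "CN": "zh"}


def target_language(user_info: Dict[str, Optional[str]], default: str = "en") -> str:
    """Prefer an explicit language setting, then the country, then a default."""
    language = user_info.get("language")
    if language:
        return language
    return COUNTRY_TO_LANGUAGE.get(user_info.get("country"), default)
```

Here the explicit language environment takes priority over the country, which is one possible design choice among several.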

5.4 Improving translation quality: using product information

FIG. 8 is a diagram illustrating, by way of example, a method of analyzing product information from an image and then using it to translate text. Referring to FIG. 8, the server 1000 may translate text, or provide recommended translated text, based on product information included in the image.

When the server 1000 acquires an image containing product information, it may analyze the image to obtain information about the product included in the image (for example, product name, product options, and product specifications).

The server 1000 may also extract target text from the image and translate it to obtain candidate translated text, and may recommend synonyms and/or similar words for the candidate translated text to the user based on the information about the product. The server 1000 may then finally determine either the candidate translated text or the recommended translated text provided to the user as the translated text.

As a more specific example, the server 1000 may translate the target text to obtain at least one candidate translated text. The server 1000 may also obtain product information through image analysis and then, based on the product information, obtain at least one recommended translated text corresponding to the candidate translated text. At least one of the candidate translated text and the recommended translated text may be provided to the user, and based on the user's response, the server 1000 may determine either the candidate translated text or the recommended translated text as the final translated text.

Meanwhile, in a synonym DB generation step (S4500), the server 1000 may generate a synonym DB by translating the target text included in the image to obtain candidate translated text, and by obtaining and storing synonyms and/or similar words for the candidate translated text based on the product information extracted from the image.

The server 1000 may perform image translation using the synonym DB. For example, when the product information extracted from an acquired image is the same as or similar to product information stored in the synonym DB, the server 1000 may, when translating the image, select words, sentences, and the like contained in the synonym DB as recommended translated text and provide them to the user.
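The synonym DB can be pictured as a store keyed by product information and candidate translation. The sketch below is an in-memory stand-in under stated assumptions: product information is reduced to a single category string, and only exact matches are looked up, whereas the description also allows "similar" product information. The class and method names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set


class SynonymDB:
    """In-memory sketch: product category -> candidate translation -> synonyms."""

    def __init__(self) -> None:
        self._store: Dict[str, Dict[str, Set[str]]] = defaultdict(lambda: defaultdict(set))

    def add(self, product_category: str, candidate: str, synonyms: Iterable[str]) -> None:
        """Store synonyms and/or similar words for a candidate translation."""
        self._store[product_category][candidate].update(synonyms)

    def recommend(self, product_category: str, candidate: str) -> List[str]:
        """Return stored recommendations, or an empty list if nothing matches."""
        if product_category not in self._store:
            return []
        return sorted(self._store[product_category].get(candidate, ()))
```

The returned list would be offered to the user as recommended translated text alongside the original candidate.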

5.5 Unstructured text translation

As described above, the target text that the server 1000 extracts from the image may include structured text and unstructured text. The structured text may be translated using the character recognition algorithm (or optical character recognition (OCR)) described above.

Unstructured text may also be translated using the character recognition algorithm (or optical character recognition (OCR)); however, to improve accuracy, image translation may be performed through analysis of the edge regions of the detected text.

As an example, the server 1000 may translate unstructured text included in the image using a neural network model. In this case, the neural network model may be trained to extract and analyze the edge region style of the text from the image and then translate the text.

5.6 Selective text translation

The server 1000 according to an embodiment may selectively translate at least one text included in the image. For example, the server 1000 may selectively translate at least one of a plurality of texts included in the image according to a predetermined criterion. Hereinafter, a method by which the server selectively translates text included in an image is described with reference to FIGS. 9 and 10.

FIGS. 9 and 10 are diagrams illustrating, by way of example, a method by which the server selectively translates text included in an image. Referring to FIG. 9, the server 1000 may extract target text from the image and then classify the target text according to a predetermined criterion. The server 1000 may classify the text included in the image into at least one group according to the predetermined criterion, and then translate those text groups that satisfy a certain condition.

After extracting the target text from the image, the server 1000 may classify the target text into a first target text group and a second target text group according to a predetermined criterion.

For example, the server 1000 may classify the target text included in the image by font type into a first target text group having a first font, a second target text group having a second font, and so on. The server 1000 may then selectively translate at least one of the first and second target text groups based on a predetermined criterion or a user's input value. For example, when the server 1000 receives a user input requesting translation of the first font, it may translate only the first target text group. Alternatively, when the server 1000 receives a user input requesting translation of a third font, it may translate neither the first nor the second target text group.

As another example, the server 1000 may classify the target text included in the image by character size into a first target text group having a first size, a second target text group having a second size, and so on. The server 1000 may then selectively translate at least one of the first and second target text groups based on a predetermined criterion or a user's input value. For example, when the server 1000 receives a user input requesting translation of the first size, it may translate only the first target text group. Alternatively, when the server 1000 receives a user input requesting translation of a third size, it may translate neither the first nor the second target text group.

As another example, the server 1000 may classify the target text included in the image by language into a first target text group in a first language, a second target text group in a second language, and so on. The server 1000 may then selectively translate at least one of the first and second target text groups based on a predetermined criterion or a user's input value. For example, when the server 1000 receives a user input requesting translation of the first language, it may translate only the first target text group. Alternatively, when the server 1000 receives a user input requesting translation of a third language, it may translate neither the first nor the second target text group.
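The three selective-translation examples above share one pattern: look up the requested attribute value among the classified groups and translate only that group, or nothing if no group matches. A minimal sketch follows; `translate_selected` and the `translate` callable are hypothetical, and the translator is passed in so that any translation backend could be used.

```python
from typing import Callable, Dict, Hashable, List


def translate_selected(
    groups: Dict[Hashable, List[str]],
    requested: Hashable,
    translate: Callable[[str], str],
) -> List[str]:
    """Translate only the group matching the requested font, size, or language.

    If no group has the requested attribute value (for example, a 'third'
    font that does not occur in the image), nothing is translated.
    """
    selected = groups.get(requested, [])
    return [translate(text) for text in selected]
```

The `groups` mapping is assumed to come from an earlier classification step keyed by the same attribute the user refers to.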

As another example, the server 1000 may classify the target text included in the image into a first target text group in a first language and a second target text group in a second language, and then translate the target text group of whichever of the first and second languages is determined to be the representative language.

In this case, the representative language may be determined based on information obtained by analyzing the image. For example, when the target text included in the image is classified into a first target text group written in a first language and a second target text group written in a second language, and the number of texts in the first target text group is greater than the number of texts in the second target text group, the first language may be determined as the representative language. Alternatively, when the average size of the texts in the first target text group is greater than the average size of the texts in the second target text group, the first language may be determined as the representative language.
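The two criteria for the representative language (text count and average text size) can be sketched directly. The input format, a list of `(language, size)` pairs, and the function name are assumptions made for this illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def representative_language(texts: List[Tuple[str, float]], by: str = "count") -> str:
    """Pick the representative language from (language, size) pairs.

    by="count":     the language with the most texts wins.
    by="mean_size": the language whose texts have the largest average size wins.
    """
    sizes: Dict[str, List[float]] = defaultdict(list)
    for language, size in texts:
        sizes[language].append(size)

    if by == "count":
        return max(sizes, key=lambda lang: len(sizes[lang]))
    if by == "mean_size":
        return max(sizes, key=lambda lang: sum(sizes[lang]) / len(sizes[lang]))
    raise ValueError(f"unknown criterion: {by}")
```

Note that the two criteria can disagree: an image with many small Chinese captions and a single large English headline yields Chinese by count and English by average size.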

The methods of classifying target text described above are examples, and the extracted target text may be classified into at least one text group according to various other criteria. For example, the extracted target text may be classified into at least one text group based on the coordinates at which the text is located in the image, the color of the text, and the like.

Meanwhile, the at least one target text group classified as described above may be translated using a neural network model. In this case, as shown in (a) of FIG. 10, all of the classified text groups may be translated using a single neural network model (NN). Alternatively, as shown in (b) of FIG. 10, each classified text group may be translated using a different neural network model.

For example, when the target text included in the image is classified by language into a first target text group in a first language, a second target text group in a second language, and a third target text group in a third language, the first target text group may be translated using a first neural network model (NN1) trained to translate the first language, the second target text group may be translated using a second neural network model (NN2) trained to translate the second language, and the third target text group may be translated using a third neural network model (NN3) trained to translate the third language.
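Routing each language group to its own model, as in (b) of FIG. 10, is essentially a dispatch table. In the sketch below the models are plain callables standing in for NN1, NN2, and NN3; `translate_groups` and the optional `fallback` are hypothetical. Passing the same callable for every language would correspond to the single-model arrangement in (a) of FIG. 10.

```python
from typing import Callable, Dict, List, Optional

Translator = Callable[[str], str]


def translate_groups(
    groups: Dict[str, List[str]],
    models: Dict[str, Translator],
    fallback: Optional[Translator] = None,
) -> Dict[str, List[str]]:
    """Translate each language group with the model registered for that language.

    Groups with no registered model are skipped unless a fallback is supplied.
    """
    results: Dict[str, List[str]] = {}
    for language, texts in groups.items():
        model = models.get(language, fallback)
        if model is None:
            continue  # no model available for this language
        results[language] = [model(text) for text in texts]
    return results
```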

5.7 Exemplary Text Translation Methods

FIGS. 11 and 12 are diagrams illustrating a specific method by which the server translates target text. Referring to FIGS. 11 and 12, the server 1000 may recognize the text included in an image character by character, word by word, and/or sentence by sentence, and then analyze the style of that text.

Referring to (a) of FIG. 11, the text included in an image may differ in language, font, size, and the like. In this case, referring to (b) of FIG. 11, the server 1000 may recognize the characters included in the image one at a time and then obtain text information about each character. However, this is exemplary; unlike (b) of FIG. 11, the server 1000 may recognize two or more characters at once and then obtain text information about those characters.

Referring to (a) and (b) of FIG. 12, the server 1000 may classify and/or reclassify the text included in the image into one or more groups based on the obtained text information. For example, the server 1000 may classify and/or reclassify the target text into one or more groups based on coordinate information indicating where the text is located in the image.

The server 1000 may translate the target text included in the image in units of the groups classified and/or reclassified in the manner described above.

6 Target Text Removal (Inpaint) and Background Compositing (Rasterize)

In addition to acquiring an image, recognizing the text included in the acquired image, and translating that text, the server 1000 may remove from the image the text area to be translated and restore the background of that text area.

The server 1000 may determine the text area in which text is located in the image and then remove that text area from the image. The server 1000 may also restore the background of the portion of the image corresponding to the removed text area.

For example, the server 1000 may remove the text area from the image, extract a representative color by analyzing the area surrounding the text area, and then restore the background of the text area based on the representative color.
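As a minimal sketch of the representative-color approach, the following fills a text area with the most common color found in a thin ring of pixels around it. The choice of "most common color" as the representative color, the ring width `margin`, and the list-of-rows image representation are all assumptions made for illustration; the description leaves the actual algorithm open.

```python
from collections import Counter
from typing import List, Tuple

Pixel = Tuple[int, int, int]          # RGB
Image = List[List[Pixel]]             # rows of pixels
Box = Tuple[int, int, int, int]       # (left, top, right, bottom), right/bottom exclusive


def fill_with_representative_color(image: Image, box: Box, margin: int = 1) -> Pixel:
    """Overwrite `box` with the most common color in a ring of `margin` pixels around it."""
    left, top, right, bottom = box
    height, width = len(image), len(image[0])
    ring: Counter = Counter()
    for y in range(max(0, top - margin), min(height, bottom + margin)):
        for x in range(max(0, left - margin), min(width, right + margin)):
            inside = left <= x < right and top <= y < bottom
            if not inside:
                ring[image[y][x]] += 1
    if not ring:
        raise ValueError("text area covers the whole image; no surrounding pixels")
    color = ring.most_common(1)[0][0]
    # Replace every pixel of the text area with the representative color.
    for y in range(max(0, top), min(height, bottom)):
        for x in range(max(0, left), min(width, right)):
            image[y][x] = color
    return color


WHITE, BLACK = (255, 255, 255), (0, 0, 0)
img = [[WHITE] * 6 for _ in range(4)]
img[1][2] = img[2][3] = BLACK          # "text" pixels inside the box
rep = fill_with_representative_color(img, (2, 1, 4, 3))
```

A flat fill like this only suits uniform backgrounds; textured or gradient backgrounds would call for one of the known inpainting algorithms mentioned in the text.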

As another example, the server 1000 may remove the text area from the image and additionally remove any text area that was not completely removed (e.g., an area corresponding to image noise).

The method described above of removing a specific area (e.g., a text area) from an image and restoring the background of the removed portion may be performed through various known algorithms.

7 Inserting Translated Text

7.1 Extracting Target Text Information

The server 1000 according to one embodiment may obtain translated text by translating the target text included in an image and then insert the translated text into the image.

FIG. 13 is a diagram illustrating a method of inserting the obtained translated text into an image. Referring to FIG. 13, the server 1000 may insert the obtained translated text into the image through a translated text insertion step (S7000).

The server 1000 may obtain translated text by translating the target text extracted from the image and then insert the translated text into the area of the image corresponding to the target text.

The server 1000 may post-process the translated text and then insert the post-processed translated text into the image. For example, the server 1000 may insert the translated text into the image based on feature information about the target text (that is, style information about the target text).

FIGS. 14 and 15 are diagrams illustrating a method of inserting, into an image, translated text that reflects the feature information (style information) of the target text.

Referring to FIG. 14, through a target text feature information acquisition step (S7100), the server 1000 may analyze the target text extracted from the image to obtain feature information. Although the drawing shows the feature information being obtained by extracting the target text and then analyzing it through a neural network model, the disclosure is not limited thereto, and the feature information may be obtained through various known image analysis algorithms.

Through a step of applying the target text feature information to the translated text (S7300), the server 1000 may reflect the feature information in the obtained translated text and insert the translated text reflecting the feature information into the image.

Referring to FIG. 15, the feature information may include information about the target text, for example, text color information, text size information, font information, text direction information, text border color information, letter-spacing information, text background color information, and text alignment information.
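The feature information listed above can be pictured as a small record that is extracted from the target text and then attached to the translated text. The sketch below is illustrative only: the class names, field names, units, and the example font name are assumptions, not part of the described method.

```python
from dataclasses import dataclass
from typing import Tuple

Color = Tuple[int, int, int]  # RGB


@dataclass(frozen=True)
class TextFeatures:
    """Style information about target text (field names are hypothetical)."""
    color: Color
    size_px: float
    font: str
    direction: str            # e.g. "horizontal" or "vertical"
    border_color: Color
    letter_spacing_px: float
    background_color: Color
    alignment: str            # e.g. "left", "center", "right"


@dataclass(frozen=True)
class StyledText:
    """Translated text paired with the style it should be rendered in."""
    text: str
    features: TextFeatures


def apply_features(translated: str, source: TextFeatures) -> StyledText:
    """Reflect the target text's features in the translated text (cf. step S7300)."""
    return StyledText(text=translated, features=source)


source_features = TextFeatures(
    color=(255, 0, 0), size_px=24.0, font="ExampleSans", direction="horizontal",
    border_color=(255, 255, 255), letter_spacing_px=1.5,
    background_color=(0, 0, 0), alignment="center",
)
styled = apply_features("Free shipping", source_features)
```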

FIGS. 16 and 17 are diagrams for explaining a method of obtaining target text feature information according to one embodiment. Referring to FIG. 16, when the target text included in an image is standardized text, the server 1000 may acquire the image, detect the target text area in which the text is located, and then obtain information about the text present in the target text area, for example, target text feature information. The server 1000 may then obtain translated text reflecting the target text feature information and insert it into the image.

Referring to FIG. 17, when the target text included in an image is non-standardized text, the server 1000 may obtain the target text feature information through an edge area detection and analysis step (S7110) and an edge area information acquisition step (S7130).

In addition, when the target text included in an image is non-standardized text, the server 1000 may detect the target text area in the image, detect feature points of the text included in the target text area, and quantify those feature points to obtain the target text feature information.

For example, the server 1000 may determine the font of the target text by obtaining the target text feature information in the manner described above, and when the translated text is subsequently inserted into the image, the server 1000 may insert translated text that reflects the font of the target text.

7.2 Additional Modification of Translated Text

FIG. 18 is a diagram illustrating a method by which the server additionally modifies and inserts translated text. Referring to FIG. 18, after applying the target text feature information described above to the translated text, the server 1000 may determine, through a step of determining whether the translated text needs modification (S7500), whether the translated text needs to be additionally modified.

When the server 1000 determines that the translated text reflecting the target text feature information needs additional modification, it may additionally modify the translated text through a step of changing the text size, alignment, or arrangement (S7700).

When the translated text, after the target text feature information has been reflected in it, would fail to satisfy a predetermined condition if inserted into the image, the server 1000 may determine that the translated text needs to be additionally modified.

For example, when the server 1000 translates a plurality of target texts included in an image to obtain first translated text and second translated text and inserts and arranges them in the image, if the first translated text and the second translated text would be placed in overlapping areas of the image, the server 1000 may determine that at least one of the first translated text and the second translated text requires additional modification.

In this case, the server 1000 may additionally change the style of at least one of the first translated text and the second translated text (e.g., change the size, arrangement, or alignment of the first translated text and the second translated text) and then insert and arrange it in the image.
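One way to picture the overlap condition is an axis-aligned bounding-box test on the areas the two translated texts would occupy. This is a hedged sketch: the box representation and the `needs_modification` helper are assumptions, and the description does not say how overlap is actually detected.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (left, top, right, bottom) in image pixels


def boxes_overlap(a: Box, b: Box) -> bool:
    """True if the two axis-aligned boxes share any interior area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def needs_modification(first: Box, second: Box) -> bool:
    """Assumed form of the check: flag the pair when their placements overlap."""
    return boxes_overlap(first, second)


first_text_box = (0.0, 0.0, 10.0, 5.0)
overlapping_box = (8.0, 2.0, 20.0, 8.0)
adjacent_box = (10.0, 0.0, 20.0, 5.0)   # touches the first box but shares no area
```

Boxes that merely touch along an edge are treated as non-overlapping here; a stricter layout rule could add a minimum gap.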

As another example, when the server 1000 translates one or more target texts included in an image to obtain translated text and inserts and arranges it in the image, if at least part of the translated text would be inserted outside the area of the image, the server 1000 may determine that the translated text requires additional modification.

In this case, the server 1000 may additionally change the style of the translated text (e.g., change its size, arrangement, or alignment) and then insert and arrange it in the image.

FIG. 19 is a diagram for explaining a method by which the server additionally modifies translated text when the translated text is inserted outside the area of the image. Hereinafter, a method of additionally modifying translated text is described with reference to FIG. 19.

Referring to (a) of FIG. 19, an image may include target text, and as described above, the server 1000 may classify the target text into one or more groups and translate it while simultaneously (or sequentially) analyzing the target text to extract feature information.

Thereafter, as shown in (b) of FIG. 19, the server 1000 may reflect the feature information in the translated text and then insert and arrange it in the image. In this case, at least part of the area of the inserted translated text may fall outside the area of the image, in which case the server 1000 may additionally change the style of the translated text before inserting and arranging it in the image.

For example, as shown in (c) of FIG. 19, when at least part of the area occupied by the translated text arranged in the image falls outside the area of the image, the server 1000 may change the size, letter spacing, character width, and/or arrangement of the translated text so that the entire area occupied by the translated text is contained within the area of the image.
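The size adjustment can be sketched as a loop that shrinks the font until the text fits inside the image. The width model below (each character occupying `size * width_ratio` plus the letter spacing) is a deliberate simplification and an assumption; a real renderer would measure the text with the actual font. All parameter names are hypothetical.

```python
def fit_font_size(text: str, start_x: float, image_width: float,
                  font_size: float, letter_spacing: float = 0.0,
                  width_ratio: float = 0.5, min_size: float = 6.0,
                  step: float = 0.5) -> float:
    """Return the largest size <= font_size (in `step` decrements) at which the
    text, starting at start_x, stays within image_width.

    If the text does not fit even at min_size, min_size is returned and the
    caller would need another remedy (e.g. changing the arrangement).
    """
    def text_width(size: float) -> float:
        # Crude estimate: every character has the same advance width.
        return len(text) * (size * width_ratio + letter_spacing)

    size = font_size
    while size > min_size and start_x + text_width(size) > image_width:
        size -= step
    return max(size, min_size)


# An 11-character string starting at x=10 in a 100-pixel-wide image.
shrunk = fit_font_size("Hello world", start_x=10, image_width=100, font_size=24)
unchanged = fit_font_size("Hi", start_x=0, image_width=100, font_size=12)
```

The same loop could reduce `letter_spacing` or `width_ratio` first, which would correspond to adjusting letter spacing or character width before touching the size.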

8 Providing an Editor Service

FIGS. 20 and 21 are diagrams illustrating the UX/UI of software that allows the inserted translated text to be further edited.

Referring to FIGS. 20 and 21, when the translated text obtained as described above is inserted and arranged in an image, the server 1000 may provide functions that allow the user to further edit the inserted translated text or to store and manage the edited image.

For example, the server 1000 may provide a function that, based on user input, additionally changes at least one of the font type, text color, text size, text direction, text alignment, text border color, letter spacing, and text background color of the obtained translated text and reflects the change in the image.

As a more specific example, the server 1000 may present synonyms and/or near-synonyms of the obtained translated text to the user through the display unit of the electronic device and change the translated text based on the user's input in response.

In addition, the server 1000 may provide the user with the various text editing functions offered by general text editing tools and software, allowing the user to edit the style of the translated image as desired.

9 Target Text Extraction Method

According to one embodiment, the server 1000 or the control unit 100 may use an algorithm to extract the text included in an image and then classify the text into one or more groups, determine those groups, and manage them.

After extracting the text included in an image, the server 1000 or the control unit 100 may classify the text into one or more groups according to predetermined criteria and manage them. When the text included in the image is subsequently translated on the basis of this group information, translation quality can be further improved.

For example, compared with simply extracting the text included in an image and then translating it, translation quality can be further improved when the text is first classified into a plurality of groups (e.g., word groups, sentence groups, paragraph groups) and the image translation is performed based on that classification.

The operation of extracting the text included in an image and then classifying it into one or more groups, determining those groups, and managing them may be performed by the server 1000 or by the control unit 100 of the electronic device. For convenience, the following description assumes that it is performed by the control unit 100.

FIG. 22 is a diagram illustrating a method of extracting target text from an image using a character recognition algorithm. Referring to FIG. 22, the server 1000 or the control unit 100 may extract the text included in the image by using an algorithm to obtain Symbol information included in the image (S9100), determining Word groups using the Symbol information (S9300), determining Line groups (S9500), and determining Paragraph groups (S9700).

FIG. 23 is a diagram illustrating a method of defining the text included in an image as one or more groups. An image may include one or more pieces of text, which may be classified into a plurality of groups. For example, the text may be organized in units of Symbol, Word, Line, and Paragraph.

For example, referring to FIG. 23, each individual character may be defined as a Symbol, a set of adjacent Symbols as a Word, and a set of adjacent Words as a Line. A set of adjacent Lines may in turn be defined as a Paragraph. However, this is exemplary, and the text included in an image may be classified and determined as Symbols, Word groups, Line groups, and Paragraph groups by various methods and criteria. A method of classifying the text included in an image into a plurality of groups is examined below.
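The Symbol, Word, Line, and Paragraph hierarchy can be sketched as nested containers. The class layout, the `box` field, and the separators used when joining text (a space between Words, a newline between Lines) are assumptions for illustration; appropriate separators depend on the language of the text.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (left, top, right, bottom)


@dataclass
class Symbol:
    """A single character and the area it occupies in the image."""
    char: str
    box: Box


@dataclass
class Word:
    """A set of adjacent Symbols."""
    symbols: List[Symbol] = field(default_factory=list)

    @property
    def text(self) -> str:
        return "".join(s.char for s in self.symbols)


@dataclass
class Line:
    """A set of adjacent Words."""
    words: List[Word] = field(default_factory=list)

    @property
    def text(self) -> str:
        return " ".join(w.text for w in self.words)


@dataclass
class Paragraph:
    """A set of adjacent Lines."""
    lines: List[Line] = field(default_factory=list)

    @property
    def text(self) -> str:
        return "\n".join(line.text for line in self.lines)


def make_word(chars: str, x0: float) -> Word:
    """Build a Word from characters laid out left to right in 10-pixel cells."""
    return Word([Symbol(c, (x0 + 10 * i, 0, x0 + 10 * (i + 1), 10))
                 for i, c in enumerate(chars)])


paragraph = Paragraph([Line([make_word("Hi", 0), make_word("all", 30)])])
```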

FIG. 24 is a diagram illustrating a method of obtaining Symbol information and text area information from an image using an algorithm. Referring to FIG. 24, the control unit 100 according to one embodiment may obtain Symbol information and text area information using an algorithm.

Referring to (a) of FIG. 24, the control unit 100 may extract Symbols from an image using a first algorithm. The control unit 100 may then obtain Symbol information from the extracted Symbols.

Here, as described above with reference to FIG. 23, a Symbol may refer to each individual character included in the image. The Symbol information is information that can be obtained from the extracted Symbols, for example, a list of all Symbols included in the image or location information (coordinate information) indicating where each Symbol is located in the image.

The control unit 100 may extract a text area from the image using the first algorithm and then obtain text area information. Here, the text area may refer to the area that the text included in the image occupies within the image. For example, the text area may refer to the area occupied in the image by at least one of an individual character, a word, a sentence, and a paragraph. As a more specific example, the text area may refer to the area occupied in the image by at least one of a Symbol, a Word, a Line, and a Paragraph.

Although (a) of FIG. 24 has been described in terms of the control unit 100 obtaining the Symbol information and the text area information using a single algorithm, the disclosure is not limited thereto, and the Symbol information and the text area information may be obtained using different algorithms.

For example, referring to (b) of FIG. 24, the control unit 100 may extract Symbols from an image through a first algorithm and obtain Symbol information based on the extracted Symbols. The control unit 100 may also extract a text area from the image using a second algorithm and obtain information about the extracted text area. Here, the first algorithm and the second algorithm may be different algorithms. In this case, for example, the first algorithm may outperform the second algorithm at extracting text from an image, that is, at extracting Symbols from an image.

FIG. 25 is a diagram illustrating a method of classifying and determining the text included in an image as Word groups according to one embodiment. The control unit 100 may assign each individual Symbol to a Word group using the obtained Symbol information and text area information.

Referring to FIG. 25, the control unit 100 may obtain Symbol location information and Word area information, compare the location of each Symbol with the Word areas, and determine the set of Symbols located within a Word area as a Word group.

The control unit 100 may obtain the Word area information in the image and, at the same time, the location information of each Symbol. The location information of a Symbol may be coordinate information. When the coordinates of a Symbol fall within a first Word area in the image, the control unit 100 may classify the Symbol into a first Word group.

Here, the coordinate information of a Symbol may include a first coordinate, a second coordinate, a third coordinate, and a fourth coordinate. For example, the coordinates of a Symbol may include the upper-left, lower-left, upper-right, and lower-right coordinates of that Symbol.
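The coordinate comparison can be sketched as a containment test on the four corner coordinates of each Symbol. Requiring all four corners to lie inside the Word area is an assumption made here for simplicity; the description only states that the Symbol's coordinates are compared with the Word area. The names `symbol_in_word_area` and `group_symbols` are hypothetical.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]              # (x, y)
Box = Tuple[float, float, float, float]  # (left, top, right, bottom)


def symbol_in_word_area(corners: List[Point], area: Box) -> bool:
    """True if every corner of the Symbol lies within the Word area."""
    left, top, right, bottom = area
    return all(left <= x <= right and top <= y <= bottom for x, y in corners)


def group_symbols(symbols: Dict[str, List[Point]],
                  word_areas: Dict[str, Box]) -> Dict[str, List[str]]:
    """Assign each Symbol to the first Word area that fully contains it."""
    groups: Dict[str, List[str]] = {name: [] for name in word_areas}
    for symbol_id, corners in symbols.items():
        for name, area in word_areas.items():
            if symbol_in_word_area(corners, area):
                groups[name].append(symbol_id)
                break
    return groups


def corners_of(left: float, top: float, right: float, bottom: float) -> List[Point]:
    """Upper-left, lower-left, upper-right, and lower-right corners of a box."""
    return [(left, top), (left, bottom), (right, top), (right, bottom)]


word_areas = {"WA1": (0, 0, 30, 10), "WA2": (40, 0, 70, 10)}
symbols = {"S1": corners_of(1, 1, 9, 9), "S2": corners_of(11, 1, 19, 9),
           "S3": corners_of(41, 1, 49, 9)}
word_groups = group_symbols(symbols, word_areas)
```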

FIG. 26 is a diagram illustrating a method of determining a plurality of Symbols included in an image as one or more Word groups.

Referring to FIG. 26, the control unit 100 may use a predetermined algorithm to obtain the areas that the Words included in the image occupy within the image (e.g., a first Word area (WA1) and a second Word area (WA2)). The control unit 100 may also use a predetermined algorithm to extract a plurality of Symbols (S1 to S9) included in the image.

Among the extracted Symbols, the control unit 100 may determine the Symbols (S1 to S9) included in the first Word area (WA1) as a first Word group and the Symbols included in the second Word area (WA2) as a second Word group.

As a more specific example, when a first Symbol (S1) is located in the first Word area (WA1) in the image, the control unit 100 may determine that the first Symbol (S1) is included in the first Word group.

As another example, when at least part of a first area occupied by a first Symbol in the image is included in both the first Word area (WA1) and the second Word area (WA2), the control unit 100 may determine whether the first Symbol belongs to the first Word group or the second Word group by considering the ratio between the degree to which the first area overlaps the first Word area (WA1) and the degree to which it overlaps the second Word area (WA2).

For example, when the degree to which at least part of the first area occupied by the first Symbol in the image overlaps the first Word area (WA1) is greater than the degree to which at least part of a second area occupied by the first Symbol in the image overlaps the second Word area (WA2), the control unit 100 may determine that the first Symbol is included in the first Word group. Here, the second area may be an area corresponding to the first area, the same area as the first area, or a different area from the first area.
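The overlap comparison can be sketched by computing the intersection area between a Symbol's box and each candidate Word area and picking the larger one. Measuring "degree of overlap" as intersection area of axis-aligned boxes is an assumption; the description does not fix the measure. The function names are hypothetical.

```python
from typing import Dict, Optional, Tuple

Box = Tuple[float, float, float, float]  # (left, top, right, bottom)


def intersection_area(a: Box, b: Box) -> float:
    """Area shared by two axis-aligned boxes (0.0 if they do not overlap)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(width, 0.0) * max(height, 0.0)


def assign_symbol(symbol_box: Box, word_areas: Dict[str, Box]) -> Optional[str]:
    """Return the name of the Word area with the largest overlap, or None if
    the Symbol overlaps no Word area. Ties go to the first area listed."""
    best = max(word_areas, key=lambda name: intersection_area(symbol_box, word_areas[name]))
    return best if intersection_area(symbol_box, word_areas[best]) > 0 else None


areas = {"WA1": (0, 0, 10, 10), "WA2": (10, 0, 20, 10)}
straddling = assign_symbol((8, 0, 14, 10), areas)   # 20 px² in WA1, 40 px² in WA2
inside_first = assign_symbol((4, 0, 9, 10), areas)
outside = assign_symbol((30, 0, 35, 10), areas)
```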

도 27은 일 실시예에 따른 이미지 내에 포함된 텍스트를 Line 그룹으로 분류 및 결정하는 방법을 예시적으로 설명하기 위한 도면이다. 제어부(100)는 회득된 적어도 하나 이상의 Word 그룹에 관한 정보를 이용하여 2 이상의 Word 그룹의 집합인 Line 그룹을 결정할 수 있다.27 is a diagram for exemplarily explaining a method of classifying and determining text included in an image into a line group according to an exemplary embodiment. The controller 100 may determine a line group, which is a set of two or more word groups, by using the obtained information on one or more word groups.

도 27을 참조하면, 제어부(100)는 상술한 방법으로 제1 Word 그룹을 획득하고, 제2 Word 그룹을 획득할 수 있다. 이후, 제어부(100)는 제1 Word 그룹 및 제2 Word 그룹을 비교하여, 미리 정해진 조건을 만족하는 경우, 제1 Word 그룹과 제2 Word 그룹의 집합을 Line 그룹으로 결정할 수 있다.Referring to FIG. 27 , the controller 100 may acquire the first word group and the second word group in the above-described method. Then, the controller 100 compares the first word group and the second word group, and determines a set of the first word group and the second word group as a line group when a predetermined condition is satisfied.

도 28 내지 도 31은 복수의 Word 그룹을 Line 그룹으로 결정하기 위한 조건을 예시적으로 설명하기 위한 도면이다. 이하에서는, 각 도면을 참조하여 복수의 Word 그룹이 Line 그룹으로 결정되기 위한 조건에 관하여 설명한다. 이때, 복수의 Word 그룹이 이후 설명될 각각의 조건을 모두 만족하는 경우에만 Line 그룹으로 결정될 수 있으나, 이에 한정되는 것은 아니며, 복수의 Word 그룹이 복수의 조건 중 적어도 어느 하나를 만족하는 경우 Line 그룹으로 결정될 수도 있다.28 to 31 are diagrams for explaining conditions for determining a plurality of word groups as a line group by way of example. Hereinafter, conditions for determining a plurality of word groups as a line group will be described with reference to each drawing. In this case, a line group may be determined only when a plurality of word groups satisfy all conditions to be described later, but is not limited thereto, and a line group when a plurality of word groups satisfy at least one of a plurality of conditions. may be determined.

도 28을 참조하면, 제어부(100)는 제1 Word 그룹(WG1) 및 제2 Word 그룹(WG2)을 획득한 후, 각 그룹이 이미지 내에서 차지하는 영역에 관한 정보에 기초하여 제1 Word 그룹(WG1) 및 제2 Word 그룹(WG2)을 Line 그룹으로 결정할지 여부를 결정할 수 있다.Referring to FIG. 28 , the controller 100 obtains a first word group (WG1) and a second word group (WG2), and then, based on information about the area occupied by each group in the image, the first word group ( WG1) and the second word group (WG2) may be determined as a line group.

예컨대, 제어부(100)는 제1 Word 그룹(WG1)이 이미지 내에서 차지하는 영역의 좌표 값을 통해 도출된 제1 높이 값(d1) 및 제2 Word 그룹(WG2)이 이미지 내에서 차지하는 영역의 좌표 값을 통해 도출된 제2 높이 값(d2)에 기초하여 제1 Word 그룹(WG1) 및 제2 Word 그룹(WG2)을 Line 그룹으로 결정할지 여부를 결정할 수 있다.For example, the controller 100 determines the coordinates of the area occupied by the first height value d1 and the second word group WG2 in the image derived from the coordinate values of the area occupied by the first word group WG1 in the image. Based on the second height value d2 derived through the value, it may be determined whether to determine the first word group WG1 and the second word group WG2 as the line group.

As a more specific example, the controller 100 may determine the first word group WG1 and the second word group WG2 as a line group when the first height value d1 is greater than or equal to the second height value d2.
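The height comparison of FIG. 28 can be sketched as follows. The `Box` type, its field names, and the assumption that each word group is described by an axis-aligned box in image coordinates (y increasing downward) are introduced here for illustration only.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned area occupied in the image (assumed representation)."""
    left: float
    top: float
    right: float
    bottom: float


def height(box: Box) -> float:
    # Height derived from the coordinate values of the occupied area.
    return abs(box.bottom - box.top)


def heights_allow_line(wg1: Box, wg2: Box) -> bool:
    d1 = height(wg1)  # first height value
    d2 = height(wg2)  # second height value
    return d1 >= d2
```

For instance, a first word group 20 pixels tall followed by a second word group 16 pixels tall would satisfy the condition, while the reverse order would not.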

Referring to FIG. 29, the controller 100 may determine whether to determine the first word group WG1 and the second word group WG2 as a line group in consideration of the angle value of the first word group WG1 and the angle value of the second word group WG2.

For example, the controller 100 may calculate an allowable angle range from angle information of the area that the first word group WG1 occupies in the image, and then compare the allowable angle range with the angle value at which the second word group WG2 is positioned in the image to determine whether to determine the first word group WG1 and the second word group WG2 as a line group.

As a more specific example, the controller 100 may calculate a first angle value from the coordinate values of the first word group WG1 and determine an allowable angle range based on the first angle value. The controller 100 may then calculate a second angle value from the coordinate values of the second word group WG2 and, when the second angle value falls within the allowable angle range, determine the first word group WG1 and the second word group WG2 as a line group.
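A minimal sketch of the angle condition of FIG. 29 is given below. The specification does not state how the angle is derived from the coordinates or how wide the allowable range is, so the use of the baseline (bottom-left to bottom-right corner) and the `tolerance_deg` parameter with its default of 5 degrees are assumptions made for the example.

```python
import math
from typing import Tuple

Point = Tuple[float, float]  # (x, y) in image coordinates


def baseline_angle(bottom_left: Point, bottom_right: Point) -> float:
    """Angle of the word group's baseline in degrees (assumed definition)."""
    dx = bottom_right[0] - bottom_left[0]
    dy = bottom_right[1] - bottom_left[1]
    return math.degrees(math.atan2(dy, dx))


def angles_allow_line(
    wg1_baseline: Tuple[Point, Point],
    wg2_baseline: Tuple[Point, Point],
    tolerance_deg: float = 5.0,  # hypothetical width of the allowable range
) -> bool:
    first_angle = baseline_angle(*wg1_baseline)
    second_angle = baseline_angle(*wg2_baseline)
    # Smallest signed difference, so that 179 and -179 degrees count as close.
    diff = (second_angle - first_angle + 180.0) % 360.0 - 180.0
    # Allowable range: first_angle +/- tolerance_deg.
    return abs(diff) <= tolerance_deg
```

Here the allowable angle range is modeled as a symmetric interval around the first angle value; other embodiments could derive the range differently.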

Referring to FIG. 30, the controller 100 may determine whether to determine the first word group WG1 and the second word group WG2 as a line group based on the width of any one symbol included in the first word group WG1.

For example, the controller 100 may calculate the width d1 of any one symbol included in the first word group WG1 and calculate a distance value d2 between the first word group WG1 and the second word group WG2. The controller 100 may then determine the first word group WG1 and the second word group WG2 as a line group when the distance value d2 between the first word group WG1 and the second word group WG2 is less than or equal to the width d1 of the symbol.
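The spacing condition of FIG. 30 can be sketched as follows. The `Box` type is an assumed representation, and the sketch further assumes left-to-right text in which the second word group lies to the right of the first, so that the distance value is the horizontal gap between the two boxes.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned area occupied in the image (assumed representation)."""
    left: float
    top: float
    right: float
    bottom: float


def gap_allows_line(symbol: Box, wg1: Box, wg2: Box) -> bool:
    """symbol is any one symbol taken from the first word group."""
    d1 = symbol.right - symbol.left      # width of the symbol
    d2 = max(0.0, wg2.left - wg1.right)  # gap between the groups, 0 if they overlap
    return d2 <= d1
```

In other words, two word groups are joined into one line when the space between them is no wider than one character of the first group.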

Referring to FIG. 31, the controller 100 may determine whether to determine the first word group WG1 and the second word group WG2 as a line group based on the minimum bounding rectangle (MBR) value of the first word group WG1.

For example, the controller 100 may calculate the MBR value of the first word group WG1, calculate a second height value of the second word group WG2, and then determine whether to determine the first word group WG1 and the second word group WG2 as a line group based on the MBR value and the second height value.

Here, the minimum bounding rectangle (MBR) may mean the smallest rectangle that can contain all of the symbols included in a word group, a line group, or a paragraph group.

As a more specific example, the controller 100 may calculate the MBR value of the first word group WG1 using a predetermined algorithm and determine a maximum allowable height value d1 from the MBR value. The controller 100 may then calculate a second height value d2 of the second word group WG2 and, when the second height value d2 is less than or equal to the maximum allowable height value d1, determine the first word group WG1 and the second word group WG2 as a line group.
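The MBR condition of FIG. 31 can be sketched as follows. The specification only refers to "a predetermined algorithm" for the MBR and does not say how the maximum allowable height is derived from it, so the axis-aligned MBR computation and the `height_scale` factor (default 1.0, meaning the MBR height itself) are assumptions made for the example.

```python
from typing import Sequence, Tuple

Box = Tuple[float, float, float, float]  # (left, top, right, bottom)


def minimum_bounding_rectangle(symbol_boxes: Sequence[Box]) -> Box:
    """Smallest axis-aligned rectangle containing every symbol box."""
    lefts, tops, rights, bottoms = zip(*symbol_boxes)
    return (min(lefts), min(tops), max(rights), max(bottoms))


def mbr_allows_line(
    wg1_symbols: Sequence[Box],
    wg2_box: Box,
    height_scale: float = 1.0,  # hypothetical factor applied to the MBR height
) -> bool:
    _, top, _, bottom = minimum_bounding_rectangle(wg1_symbols)
    d1 = (bottom - top) * height_scale  # maximum allowable height value
    d2 = wg2_box[3] - wg2_box[1]        # second height value
    return d2 <= d1
```

With symbols of differing heights in the first word group, the MBR height reflects the tallest extent of the group, which is why it can serve as an upper bound for the second group.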

FIG. 32 is a diagram illustrating an example method of classifying and determining text included in an image into paragraph groups according to an embodiment. The controller 100 may use the obtained information about at least one line group to determine a paragraph group, which is a set of two or more line groups. Referring to FIG. 32, the controller 100 may obtain a first line group and a second line group by the method described above, compare information of the first line group with information of the second line group, and, when a predetermined paragraph integration condition is satisfied, determine the first line group and the second line group as a paragraph group.

FIG. 33 is a diagram illustrating an example condition for determining a plurality of line groups as a paragraph group. Referring to FIG. 33, the controller 100 may obtain a first line group G1 and a second line group G2. In this case, the controller 100 may compare the first line group G1 with the second line group G2 and, when a predetermined condition is satisfied, determine the set of the first line group G1 and the second line group G2 as a paragraph group.

The controller 100 may extract a first height value d1 derived from the coordinate values of the area that the first line group G1 occupies in the image, and may extract a second height value derived from the coordinate values of the area that the second line group G2 occupies in the image.

The controller 100 may calculate a distance value d2 between the first line group G1 and the second line group G2. The distance value may be the distance between the first line group G1 and the second line group G2, but is not limited thereto, and may be a value determined based on the MBR of the first line group G1 and the MBR of the second line group G2.

When the distance value d2 between the first line group G1 and the second line group G2 is less than or equal to the first height value d1 of the first line group G1, the controller 100 may determine the first line group G1 and the second line group G2 as one paragraph group.
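The paragraph condition of FIG. 33 can be sketched as follows. The `Box` type is an assumed representation of the area (or MBR) of each line group, and the sketch assumes horizontal text in which the second line group lies below the first, so that the distance value is the vertical gap between the two boxes.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned area or MBR of a line group (assumed representation)."""
    left: float
    top: float
    right: float
    bottom: float


def lines_form_paragraph(g1: Box, g2: Box) -> bool:
    d1 = g1.bottom - g1.top            # first height value of G1
    d2 = max(0.0, g2.top - g1.bottom)  # vertical gap between G1 and G2
    return d2 <= d1
```

Two lines are therefore merged into one paragraph group when the spacing between them does not exceed the height of the first line.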

The features, structures, effects, and the like described in the embodiments above are included in at least one embodiment of the present invention and are not necessarily limited to a single embodiment. Furthermore, the features, structures, effects, and the like illustrated in each embodiment may be combined or modified for other embodiments by a person having ordinary skill in the art to which the embodiments pertain. Accordingly, matters relating to such combinations and modifications should be construed as falling within the scope of the present invention.

In addition, although the description above has focused on the embodiments, they are merely examples and do not limit the present invention. A person having ordinary skill in the art to which the present invention pertains will appreciate that various modifications and applications not illustrated above are possible without departing from the essential characteristics of the embodiments. That is, each component specifically shown in the embodiments may be modified and implemented. Differences relating to such modifications and applications should be construed as falling within the scope of the present invention as defined in the appended claims.

Claims

a storage unit for storing an image including text; and
a control unit for extracting text included in the image,
wherein the control unit
classifies the text included in the image into at least one group and extracts it by:
Obtaining symbol information on a plurality of symbols included in the image through a first algorithm, wherein the plurality of symbols are individual unit characters constituting the text;
obtaining text area information about a text area in the image through a second algorithm, wherein the text area is an area corresponding to the text in the image;
Classifying text included in the image into at least one group based on the symbol information and the text area information;
A device for extracting and grouping text contained within images.

According to claim 1,
The symbol information includes at least one of the type, list, number, coordinates, and location of a plurality of symbols included in the image.
A device for extracting and grouping text contained within images.

According to claim 1,
The first algorithm and the second algorithm are different algorithms,
A device for extracting and grouping text contained within images.

According to claim 1,
The control unit,
obtaining, from the image through the second algorithm, word area information about a word area, which is an area corresponding to a word included in the image, wherein the word is a set of symbols satisfying a predetermined condition;
determining at least one or more of the plurality of symbols as a word group based on the symbol information and the word region information;
A device for extracting and grouping text contained within images.

According to claim 4,
The control unit,
Determine at least one or more of the plurality of symbols as a word group based on positions or coordinates occupied by the plurality of symbols included in the symbol information in the image;
Determining at least one symbol located in the word area among the plurality of symbols as a word group,
A device for extracting and grouping text contained within images.

According to claim 4,
The symbol information includes information on coordinates at which a first symbol, which is any one of the plurality of symbols, is located in the image, wherein the coordinates include a first coordinate that is an upper left coordinate of the first symbol, a second coordinate that is a lower left coordinate, a third coordinate that is an upper right coordinate, and a fourth coordinate that is a lower right coordinate,
The control unit,
Determining the first symbol as a word group when any one of the first to fourth coordinates is located in the word area,
A device for extracting and grouping text contained within images.

According to claim 4,
The word area includes a first word area and a second word area,
The control unit,
determining a symbol located in the first word area among the plurality of symbols as a first word group, and determining a symbol located in the second word area as a second word group;
A device for extracting and grouping text contained within images.

According to claim 7,
The control unit,
when a first symbol, which is any one of the plurality of symbols, is included in both the first word area and the second word area, determining the first symbol as the first word group if the degree to which the area occupied by the first symbol in the image overlaps the first word area is greater than the degree to which it overlaps the second word area;
A device for extracting and grouping text contained within images.

According to claim 4,
The control unit,
Based on the symbol information and the word region information, at least one of the plurality of symbols is determined as a word group, the word group including a first word group and a second word group;
When a predetermined condition is satisfied, the first word group and the second word group are determined as a line group;
The line group refers to a set of words that satisfy a predetermined condition.
A device for extracting and grouping text contained within images.

According to claim 9,
The control unit,
comparing a first height value of an area occupied by the first word group in the image with a second height value of an area occupied by the second word group in the image to determine whether to determine the first word group and the second word group as the line group;
A device for extracting and grouping text contained within images.

According to claim 9,
The control unit,
Calculating a first angle value related to the arrangement of the first word group in the image from the coordinate values of the first word group;
After determining the allowable angle range based on the first angle value,
After calculating a second angle value related to the arrangement of the second word group in the image from the coordinate value of the second word group,
determining the first word group and the second word group as the line group when the second angle value is included in the allowable angle range;
A device for extracting and grouping text contained within images.

According to claim 9,
The control unit,
determining whether to determine the first word group and the second word group as the line group based on the width of any one symbol included in the first word group;
A device for extracting and grouping text contained within images.

According to claim 12,
The control unit,
Calculate a width of a first symbol, which is any one symbol included in the first word group, and calculate a spacing value between the first word group and the second word group,
determining the first word group and the second word group as the line group when an interval value between the first word group and the second word group is less than or equal to a width of the first symbol;
A device for extracting and grouping text contained within images.

According to claim 9,
The control unit,
Calculate a Minimum Bounding Rectangle (MBR) value of the first word group, wherein the Minimum Bounding Rectangle (MBR) is a rectangle having a minimum size that can include all symbols included in the first word group;
After determining the maximum allowable height value based on the MBR value,
determining the first word group and the second word group as the line group when a height value of an area occupied by the second word group in the image is less than or equal to the maximum allowable height value;
A device for extracting and grouping text contained within images.

According to claim 9,
The control unit,
When a predetermined condition is satisfied, the first word group and the second word group are determined as a line group, the line group including a first line group and a second line group;
determining the first line group and the second line group as a paragraph group when a predetermined condition is satisfied;
A device for extracting and grouping text contained within images.

According to claim 15,
The control unit,
comparing a first height value of an area occupied by the first line group in the image with a second height value of an area occupied by the second line group in the image to determine whether to determine the first line group and the second line group as the paragraph group;
A device for extracting and grouping text contained within images.

According to claim 15,
The control unit,
Calculate a first height value for an area occupied by the first line group in the image, and calculate a distance value between the first line group and the second line group,
determining the first line group and the second line group as the paragraph group when a distance value between the first line group and the second line group is less than or equal to the first height value;
A device for extracting and grouping text contained within images.

A method for extracting text included in an image,
obtaining an image, wherein the image is an image including text;
obtaining symbol information on a plurality of symbols included in the image through a first algorithm, wherein the plurality of symbols are individual unit characters constituting the text;
acquiring text area information about a text area in the image through a second algorithm, wherein the text area corresponds to the text in the image;
classifying text included in the image into at least one group based on the symbol information and the text area information; and
Classifying and extracting text included in the image into the at least one or more groups;
A method for extracting and grouping text contained within images.