KR20220079431A

KR20220079431A - Method for extracting tag information from screenshot image and system thereof

Info

Publication number: KR20220079431A
Application number: KR1020210146335A
Authority: KR
Inventors: 정구일; 홍건표
Original assignee: 주식회사 마이너
Priority date: 2020-12-04
Filing date: 2021-10-29
Publication date: 2022-06-13
Also published as: WO2022119136A1; KR20220079433A; KR20220079432A

Abstract

A system for extracting tag information from a screenshot image according to an embodiment of the present application includes: a user terminal in which at least one screenshot image is stored; and a server that transmits and receives information about the screenshot image to and from the user terminal. The screenshot image includes at least one text. The user terminal obtains, within the screenshot image, a text area in which the text exists and a non-text area in which the text does not exist, extracts text information from the text area through a character reading module, maps the text information to the screenshot image, and transmits it to the server. The server may receive the text information from the user terminal, extract a keyword from the text information, map tag information derived based on the keyword to the screenshot image, and then transmit it to the user terminal.

Description

Method for extracting tag information from screenshot image and system thereof

The present invention relates to a method for classifying and searching screenshot images stored in a terminal based on text or image information extracted from the screenshot images.

Users capture images with a terminal and store them on the terminal and, when necessary, capture various content obtained through the terminal as screenshots and store those on the terminal as well. Both the images captured directly by the user (hereinafter, "captured images") and the screenshot images captured and saved for the purpose of obtaining information are stored in a single content repository.

Over time, countless captured images and screenshot images accumulate in the single content repository. When a user wishes to find a specific screenshot image, the user must manually browse all the images stored in the content repository to locate it.

Because the user must manually browse dozens or even hundreds of images to find the desired screenshot image, this search process is time-consuming and cumbersome.

One object of the present invention is to provide a method of extracting tag information from a screenshot image.

One object of the present invention is to provide a method of searching for a screenshot image based on tag information mapped to the screenshot image.

One object of the present invention relates to a method of providing screenshot image search results based on tag information mapped to screenshot images.

The problems to be solved by the present invention are not limited to those described above, and problems not mentioned will be clearly understood by those of ordinary skill in the art to which the present invention pertains from this specification and the accompanying drawings.

The system for extracting tag information from a screenshot image disclosed in the present application includes: a user terminal in which at least one screenshot image is stored; and a server that transmits and receives information about the screenshot image to and from the user terminal. The screenshot image includes at least one text. The user terminal obtains, within the screenshot image, a text area in which the text exists and a non-text area in which the text does not exist, extracts text information from the text area through a character reading module, maps the text information to an identification number assigned to the screenshot image, and transmits it to the server. The server may receive the text information from the user terminal, extract a keyword from the text information, map tag information derived based on the keyword to the identification number, and then transmit it to the user terminal.

The means for solving the problems of the present invention are not limited to those described above, and means not mentioned will be clearly understood by those of ordinary skill in the art to which the present invention pertains from this specification and the accompanying drawings.

According to embodiments of the present application, tag information is mapped to each screenshot image that the user captures and saves, so the user can quickly and accurately find the screenshot image they need among the many screenshot images stored in the terminal based on the mapped tag information.

The effects of the present invention are not limited to those described above, and effects not mentioned will be clearly understood by those of ordinary skill in the art to which the present invention pertains from this specification and the accompanying drawings.

FIG. 1 is a diagram for explaining a method of extracting tag information from a screenshot image according to an embodiment.
FIG. 2 is a diagram for explaining the configuration of a user terminal.
FIG. 3 is a diagram illustrating, by way of example, various images stored in a storage unit provided in a user terminal.
FIG. 4 is a diagram for explaining a screenshot image by way of example.
FIG. 5 is a diagram for explaining the overall process of a method of extracting tag information from a screenshot image according to an embodiment.
FIG. 6 is a diagram for explaining how a screenshot image is divided into a text area and a non-text area through a neural network model.
FIGS. 7 and 8 are diagrams for explaining a method of extracting text from a text area.
FIGS. 9 to 11 are diagrams for explaining, by way of example, text information extracted from a text area.
FIG. 12 is a diagram for explaining a method of extracting image information from a non-text area.
FIG. 13 is a diagram for explaining the overall process by which a server extracts a keyword from text information.
FIG. 14 is a diagram for explaining a method of obtaining a keyword from text information through a second neural network model according to an embodiment.
FIG. 15 is a diagram for explaining, by way of example, keywords and target information extracted from a screenshot image.
FIG. 16 is a diagram for explaining, by way of example, a screen displayed through an output unit of a user terminal.
FIG. 17 is a diagram for explaining a method of selecting a recommended tag and providing it to a user.
FIG. 18 is a diagram for explaining a method of searching for a screenshot image based on tag information and a method of providing the search result to a user.

The above-described objects, features, and advantages of the present application will become more apparent from the following detailed description taken in conjunction with the accompanying drawings. However, since the present application may be modified in various ways and may have various embodiments, specific embodiments are illustrated in the drawings and described in detail below.

Throughout the specification, the same reference numerals in principle denote the same components. In addition, components that have the same function within the scope of the same idea shown in the drawings of each embodiment are described using the same reference numerals, and redundant descriptions thereof are omitted.

Where a detailed description of a known function or configuration related to the present application is judged to unnecessarily obscure the gist of the present application, that detailed description is omitted. In addition, ordinal numbers used in this specification (for example, first, second, and so on) are merely identifiers for distinguishing one component from another.

In addition, the suffixes "module" and "unit" used for components in the following embodiments are assigned or used interchangeably only for ease of drafting the specification and do not in themselves have distinct meanings or roles.

In the following embodiments, a singular expression includes the plural unless the context clearly indicates otherwise.

In the following embodiments, terms such as "include" or "have" mean that the features or components described in the specification are present, and do not preclude the possibility that one or more other features or components may be added.

In the drawings, the sizes of components may be exaggerated or reduced for convenience of description. For example, the size and thickness of each component shown in the drawings are shown arbitrarily for convenience of description, and the present invention is not necessarily limited to what is illustrated.

Where an embodiment can be implemented differently, a specific sequence of processes may be performed in an order different from the one described. For example, two processes described in succession may be performed substantially simultaneously, or may be performed in the reverse of the order described.

In the following embodiments, when components are said to be connected, this includes not only the case where the components are directly connected but also the case where they are indirectly connected with other components interposed between them.

For example, when components are said in this specification to be electrically connected, this includes not only the case where the components are directly electrically connected but also the case where they are indirectly electrically connected with other components interposed between them.

According to an embodiment, a system for extracting tag information from a screenshot image includes: a user terminal in which at least one screenshot image is stored; and a server that transmits and receives information about the screenshot image to and from the user terminal. The screenshot image includes at least one text. The user terminal obtains, within the screenshot image, a text area in which the text exists and a non-text area in which the text does not exist, extracts text information from the text area through a character reading module, maps the text information to an identification number assigned to the screenshot image, and transmits it to the server. The server may receive the text information from the user terminal, extract a keyword from the text information, map tag information derived based on the keyword to the identification number, and then transmit it to the user terminal.
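
The exchange described in this embodiment can be illustrated with a minimal Python sketch. It is not taken from the application: the message classes, the function names, and the capitalized-word extractor are hypothetical stand-ins, and using each keyword directly as a tag is an assumption, since the text only states that tag information is derived based on the keyword.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class TextInfoMessage:
    """Terminal to server: OCR text keyed by the screenshot's identification number."""
    screenshot_id: int
    text: str


@dataclass
class TagInfoMessage:
    """Server to terminal: derived tags keyed by the same identification number."""
    screenshot_id: int
    tags: List[str]


def terminal_build_message(screenshot_id: int, region_texts: List[str]) -> TextInfoMessage:
    # region_texts stands in for the per-area output of the character reading module.
    return TextInfoMessage(screenshot_id, "\n".join(region_texts))


def server_derive_tags(msg: TextInfoMessage,
                       extract_keywords: Callable[[str], List[str]]) -> TagInfoMessage:
    # Assumption: each extracted keyword is used directly as a tag.
    keywords = extract_keywords(msg.text)
    return TagInfoMessage(msg.screenshot_id, keywords)


def terminal_store_tags(tag_store: Dict[int, List[str]], msg: TagInfoMessage) -> None:
    # The terminal attaches the returned tags to the stored screenshot via its ID.
    tag_store[msg.screenshot_id] = msg.tags


def toy_extractor(text: str) -> List[str]:
    # Hypothetical stand-in for the server's keyword model: keep capitalized words once.
    seen: List[str] = []
    for word in text.split():
        if word[:1].isupper() and word not in seen:
            seen.append(word)
    return seen
```

Only the text information and the identification number cross the network in this sketch; the screenshot image itself stays on the terminal, which matches the division of work described above.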

The screenshot image may include at least one text area. The text area includes a first text area having a first attribute and a second text area having a second attribute through an N-th text area having an N-th attribute, and the first to N-th attributes may be determined by at least one of the size and the font of the text.

The text area may include a first text area, which is an area containing text in the screenshot image that has a first size within a margin of error, and a second text area, which is an area containing text that has a second size within a margin of error.

The first text area may be an area of text that, among the text having the first size in the screenshot image, is adjacent within a predetermined range, and the second text area may be an area of text that, among the text having the second size in the screenshot image, is adjacent within a predetermined range.
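
One way to picture the size-and-adjacency grouping is the following Python sketch. It is an illustration only: the application obtains text areas with a pre-trained neural network model, whereas this sketch swaps in a simple rule-based clustering. The `TextBox` type, the use of box height as a proxy for text size, and the `size_tol` and `max_gap` thresholds are all assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class TextBox:
    x: float  # left edge in pixels
    y: float  # top edge in pixels
    w: float  # width in pixels
    h: float  # height in pixels, used here as a proxy for text size


def similar_size(a: TextBox, b: TextBox, tol: float) -> bool:
    # "Same size within a margin of error": relative height difference below tol.
    return abs(a.h - b.h) <= tol * max(a.h, b.h)


def gap(a: TextBox, b: TextBox) -> float:
    # Larger of the horizontal and vertical gaps; 0 on an axis where the boxes overlap.
    dx = max(a.x - (b.x + b.w), b.x - (a.x + a.w), 0.0)
    dy = max(a.y - (b.y + b.h), b.y - (a.y + a.h), 0.0)
    return max(dx, dy)


def group_text_areas(boxes: List[TextBox], size_tol: float = 0.15,
                     max_gap: float = 20.0) -> List[List[TextBox]]:
    # Union-find: boxes that are similar in size and adjacent end up in one area.
    parent = list(range(len(boxes)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if similar_size(boxes[i], boxes[j], size_tol) and gap(boxes[i], boxes[j]) <= max_gap:
                parent[find(j)] = find(i)

    groups: Dict[int, List[TextBox]] = {}
    for i, box in enumerate(boxes):
        groups.setdefault(find(i), []).append(box)
    return list(groups.values())
```

Because the merging is transitive, a long chain of boxes whose sizes drift gradually could end up in one area; a stricter variant would compare each box against a representative size for the group.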

The text area may include a first text area, which is an area containing text in the screenshot image that has a first font, and a second text area, which is an area containing text that has a second font.

The first text area may be an area of text that, among the text having the first font in the screenshot image, is adjacent within a predetermined range, and the second text area may be an area of text that, among the text having the second font in the screenshot image, is adjacent within a predetermined range.

The text information includes first text information and second text information. The user terminal extracts the first text information from the first text area through the character reading module and extracts the second text information from the second text area through the character reading module, and the first text area and the second text area may be input to the character reading module independently of each other.

The user terminal obtains the text area from the screenshot image using a pre-trained neural network model, and the pre-trained neural network model may be trained to divide the text area into the first text area and the second text area through the N-th text area.

A plurality of keywords may be extracted from the text information, and each keyword may be a word, number, sentence, or combination thereof that represents the screenshot image among the text included in the text information.

The server extracts a plurality of keywords from the text information, and an importance is reflected in each of the plurality of keywords. The importance may be determined based on the probability that a word, number, sentence, or combination thereof included in the text information represents the screenshot image.
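
As a rough illustration of importance-weighted keywords, the sketch below ranks candidate keywords by a probability score. The function name `rank_keywords`, the `top_k` cutoff, and the idea that the probabilities arrive as a ready-made dictionary are assumptions; the application does not specify how the server's model exposes these values.

```python
from typing import Dict, List, Tuple


def rank_keywords(candidates: Dict[str, float], top_k: int = 3) -> List[Tuple[str, float]]:
    """Order candidate keywords by the probability that each represents the screenshot."""
    # The probability itself is treated as the importance (an assumption).
    ranked = sorted(candidates.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]
```

For example, calling `rank_keywords({"Americano": 0.91, "menu": 0.75, "the": 0.02}, top_k=2)` keeps the two candidates most likely to represent the image and drops the filler word.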

The server may extract the keyword from the text information through a pre-trained neural network model, and the pre-trained neural network model may be trained to obtain the keyword based on the text information.

The user terminal extracts object information from the non-text area, maps the object information to the screenshot image, and transmits it to the server. The server may derive the tag information based on the object information and the keyword, map the derived tag information to the screenshot image, and then transmit it to the user terminal.

According to an embodiment, a system for providing a user with tag information extracted from screenshot images includes: a user terminal in which a plurality of screenshot images are stored, the screenshot images including a first screenshot image and a second screenshot image; and a server that transmits and receives information about the screenshot images to and from the user terminal. The screenshot images include at least one text. The user terminal extracts first text information from the first screenshot image through a character reading module and transmits the first text information to the server, and extracts second text information from the second screenshot image through the character reading module and transmits the second text information to the server. The server extracts first tag information based on the first text information, extracts second tag information based on the second text information, and transmits the extracted first tag information and second tag information to the user terminal. The user terminal may determine at least one of the first tag information and the second tag information as a recommended tag according to a predetermined criterion and provide it to the user.

The user terminal receives the first tag information from the server at a first time point and receives the second tag information at a second time point, and the user terminal determines the first tag information as the recommended tag, where the first time point may be earlier than the second time point.

The user terminal receives from the server a plurality of pieces of tag information extracted from the plurality of screenshot images, and the user terminal may determine the recommended tag based on the frequency with which each piece of tag information is received from the server.

The user terminal may determine, as the recommended tag, the tag information received most frequently from the server among the plurality of pieces of tag information, and provide it to the user.
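
The frequency-based criterion can be sketched in a few lines of Python. The function name `recommend_tag` and the representation of received tag information as a flat sequence of strings in arrival order are assumptions. Breaking ties in favor of the tag that arrived first is also an assumption, made here by combining this criterion with the earlier-time-point criterion described above.

```python
from collections import Counter
from typing import Iterable, Optional


def recommend_tag(received_tags: Iterable[str]) -> Optional[str]:
    """Return the tag received most often from the server, or None if none were received."""
    counts = Counter(received_tags)
    if not counts:
        return None
    # Counter keeps first-seen order, so a tie goes to the tag that arrived earliest.
    return counts.most_common(1)[0][0]
```

A terminal would call this with every tag it has received so far, in arrival order, whenever it needs to refresh the recommendation shown to the user.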

The user terminal obtains a first text area in which text exists within the first screenshot image and extracts the first text information from the first text area through the character reading module, and obtains a second text area in which text exists within the second screenshot image and extracts the second text information from the second text area through the character reading module, where the first text area and the second text area may be input to the character reading module independently of each other.

The first text area is an area in which text exists within the first screenshot image and includes a text area having a first attribute and a text area having a second attribute through a text area having an N-th attribute, and the first to N-th attributes may be determined based on at least one of the size and the font of the text.

The second text area is an area in which text exists within the second screenshot image and includes a text area having a first attribute and a text area having a second attribute through a text area having an N-th attribute, and the first to N-th attributes may be determined based on at least one of the size and the font of the text.

The first text area may be an area of text that is adjacent within a predetermined range among the text in the first screenshot image, and the second text area may be an area of text that is adjacent within a predetermined range among the text in the second screenshot image.

After receiving the first text information from the user terminal, the server may extract a first keyword from the first text information and extract the first tag information based on the first keyword; after receiving the second text information from the user terminal, the server may extract a second keyword from the second text information and extract the second tag information based on the second keyword.

The first keyword may be at least one word, number, sentence, or combination thereof that represents the first screenshot image among the text included in the first text information, and the second keyword may be at least one word, number, sentence, or combination thereof that represents the second screenshot image among the text included in the second text information.

The first tag information may include a first representative image determined based on the first keyword, and the second tag information may include a second representative image determined based on the second keyword. When providing the recommended tag to the user, the user terminal may also provide the user with at least one of the first representative image and the second representative image.

An importance is reflected in the first keyword, and this importance is determined based on the probability that a word, number, sentence, or combination thereof in the text included in the first text information represents the first screenshot image. Likewise, an importance is reflected in the second keyword, and this importance may be determined based on the probability that a word, number, sentence, or combination thereof in the text included in the second text information represents the second screenshot image.

According to an embodiment, a user terminal includes: a storage unit that stores a plurality of screenshot images, at least one piece of tag information being mapped to each of the plurality of screenshot images; a display unit that displays at least one of the plurality of screenshot images; an input unit that receives a user input; a communication unit that communicates with an external server; and a control unit that determines, based on the user input, at least one screenshot image to be displayed on the display unit among the plurality of screenshot images. When the user input is entered through the input unit, the control unit determines matching tag information corresponding to the user input among the tag information stored in the storage unit and then controls the display unit to output the screenshot image to which the matching tag information is mapped. The control unit extracts text information from the screenshot image through a character reading module and transmits the text information to the server, and the server may extract the tag information from the text information and transmit it to the user terminal.

The user input includes a first user input and a second user input. The input unit receives the first user input at a first time point and receives the second user input at a second time point different from the first time point, and the control unit may determine the matching tag information in consideration of the time points at which the first user input and the second user input were entered into the input unit.

The control unit determines the matching tag information based on the first user input and the second user input while placing importance on the first user input, and the first time point may be earlier than the second time point.

The matching tag information may be the tag information, among the plurality of pieces of tag information, that corresponds to at least one of the first user input and the second user input.

The matching tag information may be the tag information that corresponds to the second user input from among the tag information, out of the plurality of pieces of tag information, that corresponds to the first user input.
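
The two alternatives above differ in how the second user input is combined with the first: one widens the result and the other narrows it. The Python sketch below contrasts them. The helper names are hypothetical, and treating "corresponds to" as a case-insensitive substring match is an assumption; the application does not define the matching rule.

```python
from typing import List


def corresponds(tag: str, user_input: str) -> bool:
    # Assumption: a tag corresponds to an input if it contains it, ignoring case.
    return user_input.casefold() in tag.casefold()


def match_either(tags: List[str], first: str, second: str) -> List[str]:
    # Tags that correspond to at least one of the two inputs.
    return [t for t in tags if corresponds(t, first) or corresponds(t, second)]


def match_narrowed(tags: List[str], first: str, second: str) -> List[str]:
    # Filter by the first input, then filter that subset by the second input.
    first_pass = [t for t in tags if corresponds(t, first)]
    return [t for t in first_pass if corresponds(t, second)]
```

With the tags "coffee shop", "coffee recipe", and "shop hours" and the inputs "coffee" then "shop", `match_either` returns all three tags while `match_narrowed` returns only "coffee shop".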

The control unit controls the display unit to output the screenshot images to which the matching tag information is mapped, and may control the screenshot images to be displayed on the display unit in the chronological order in which they were stored in the storage unit.

The control unit controls the display unit to output the screenshot images to which the matching tag information is mapped, and may control them to be displayed on the display unit in descending order of the probability that the matching tag information corresponds to the user input.
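
The two display orders can be sketched as follows. Everything named here is hypothetical: the `Screenshot` record, the `search` function, and in particular `tag_score`, which stands in for the probability that a tag corresponds to the user input by using the fraction of the tag covered by the input. Showing the oldest screenshot first in the time ordering is also an assumption, since the text does not say which direction is used.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Screenshot:
    screenshot_id: int
    saved_at: float  # time the image was stored, e.g. a Unix timestamp
    tags: List[str]


def tag_score(tag: str, user_input: str) -> float:
    # Stand-in score in [0, 1]: share of the tag covered by the input, 0 if not contained.
    t, q = tag.casefold(), user_input.casefold()
    if not q or q not in t:
        return 0.0
    return len(q) / len(t)


def search(shots: List[Screenshot], user_input: str, order: str = "score") -> List[int]:
    """Return IDs of screenshots with a matching tag, ordered by score or by save time."""
    hits: List[Tuple[float, Screenshot]] = []
    for shot in shots:
        best = max((tag_score(t, user_input) for t in shot.tags), default=0.0)
        if best > 0.0:
            hits.append((best, shot))
    if order == "time":
        hits.sort(key=lambda hit: hit[1].saved_at)      # oldest first (assumption)
    else:
        hits.sort(key=lambda hit: hit[0], reverse=True)  # highest score first
    return [shot.screenshot_id for _, shot in hits]
```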

1 Overall process

Hereinafter, a method of extracting tag information from a screenshot image according to an embodiment will be described.

FIG. 1 is a diagram for explaining a method of extracting tag information from a screenshot image according to an embodiment, and FIG. 2 is a diagram for explaining the configuration of a user terminal.

Referring to FIG. 1, the method of extracting tag information from a screenshot image according to an embodiment may be performed through a user terminal 1000 and a server 2000.

Referring to FIG. 2, the user terminal 1000 may include a controller 100, an image capturing unit 200, a storage unit 300, a user input unit 400, an output unit 500, a power supply unit 600, and a communication unit 700. In this case, the user terminal 1000 may include a portable information and communication device such as a smartphone or a tablet.

The image capturing unit 200 is a digital camera and may include an image sensor and an image processing unit. The image sensor is a device that converts an optical image into an electrical signal, and may be configured as a chip in which a plurality of photodiodes are integrated. For example, the image sensor may include a charge coupled device (CCD), a complementary metal oxide semiconductor (CMOS) sensor, or the like. Meanwhile, the image processing unit may generate image information by processing the captured result.

The storage unit 300 is a storage means for storing data that can be read by a microprocessor, and may include a hard disk drive (HDD), a solid state disk (SSD), a silicon disk drive (SDD), a ROM, a RAM, a CD-ROM, a magnetic tape, a floppy disk, and an optical data storage device.

More specifically, data received by the user terminal 1000 may be stored in the storage unit 300. For example, the storage unit 300 may store images that the user has taken directly through the image capturing unit 200, and may store screenshot images in which the user has captured information obtained online and output on the output unit 500.

The user input unit 400 receives a user's input to the user terminal 1000. The received input may be transmitted to the controller 100. According to an embodiment, the user input unit 400 may receive the user's input through a touch display. The user input unit 400 may also refer to a user interface screen on which a command is input by the user.

The output unit 500 outputs various types of information according to control commands of the controller 100. According to an embodiment, the output unit 500 may output information through a display panel. More specifically, the output unit 500 may output information related to the user's hair loss state through the display panel. However, the output unit 500 is not limited to a display panel, and may include various means capable of outputting information, such as a speaker.

The power supply unit 600 includes a battery, and the battery may be built into the user terminal 1000 or may be provided so as to be detachable from the outside. The power supply unit 600 may supply the power required by each component of the user terminal 1000.

The communication unit 700 may include a wireless communication module and/or a wired communication module. Here, the wireless communication module may include a Wi-Fi communication module, a cellular communication module, and the like.

The controller 100 may include at least one processor. In this case, each processor may perform a predetermined operation by executing at least one instruction stored in a memory. Specifically, the controller 100 may control the overall operation of the components included in the user terminal 1000. In other words, the user terminal 1000 may be controlled or operated by the controller 100.

According to an embodiment, the user may perform a web search with the user terminal 1000 and obtain necessary information from the search results. If necessary, the user may store the obtained information in the terminal, and in doing so may use the screenshot function provided in the user terminal 1000.

Meanwhile, in addition to the screenshot images captured by the user, the storage unit 300 may contain a plurality of images stored as a result of the user directly photographing a subject (hereinafter referred to as captured images). In this case, captured images and screenshot images may be accumulated and stored together in the storage unit 300.

FIG. 3 is a diagram illustrating, by way of example, various images stored in the storage unit provided in the user terminal. Referring to FIG. 3, for example, the screenshot images captured by the user and the captured images may be stored in a single storage unit 300. As another example, the screenshot images and the captured images may be classified and then stored in different locations of the storage unit 300.

In this case, according to an embodiment, only the screenshot images may be separated from among the plurality of images stored in the storage unit 300, or only a desired screenshot image may be searched for from among the plurality of screenshot images stored in the storage unit 300.

Hereinafter, a method of extracting tag information from a screenshot image will be described as one way of searching for a desired screenshot image among the plurality of screenshot images stored in the storage unit 300.

2 Method of extracting tag information from a screenshot image

2.1 Description of the screenshot image

FIG. 4 is a diagram for explaining a screenshot image by way of example. A screenshot image will be described with reference to FIG. 4.

According to an embodiment, a plurality of screenshot images may be stored in the storage unit 300. In this case, a screenshot image may mean an image in which the screen displayed on the output unit 500 of the user terminal 1000 has been captured by the user.

A screenshot image may be an image in which part of a web browsing screen (for example, a shopping mall or news page) is captured, or an image in which part of the execution screen of an application stored in the terminal (for example, a gifticon image) is captured.

A captured screenshot image contains various information, and that information may be present in the form of images or text. For example, referring to FIG. 4, the screenshot image may include a plurality of image areas (NTA1, NTA2) and a plurality of text areas (TA1 to TA8).

In this case, the text included in the plurality of text areas (TA1 to TA8) may have the same font and the same size. Alternatively, the text may have different fonts and different sizes. Alternatively, the text may have the same font but different sizes. Alternatively, the text may have different fonts but the same size.

The method of extracting tag information from a screenshot image according to an embodiment may extract the tag information based on the plurality of text areas (TA1 to TA8) included in the screenshot image, as described in detail below.

2.2 Overall process performed on the terminal

FIG. 5 is a diagram for explaining the overall process of a method of extracting tag information from a screenshot image according to an embodiment.

Referring to FIG. 5, the method of extracting tag information from a screenshot image according to an embodiment may include acquiring a screenshot image (S1100), analyzing the screenshot image (S1200), dividing the screenshot image into a plurality of areas (S1300), extracting text from a text area among the plurality of areas (S1400), mapping the extracted text to the screenshot image and storing it (S1500), and transmitting the extracted text to the server (S1600).
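The sequence of steps S1100 to S1600 can be sketched as a thin orchestration function. This is only an illustrative skeleton: the function name, the callable parameters, and the record layout are assumptions, and each callable stands in for a component that is described separately in this specification.

```python
from typing import Any, Callable


def run_terminal_pipeline(
    screenshot_id: str,
    image: Any,                                        # S1100: acquired screenshot
    preprocess: Callable[[Any], Any],
    split_regions: Callable[[Any], tuple[list, list]],
    read_text: Callable[[Any, Any], str],
    store: Callable[[dict], None],
    send: Callable[[dict], None],
) -> dict:
    """Run the terminal-side steps in order and return the mapped record."""
    prepared = preprocess(image)                                # S1200: analysis / preprocessing
    text_areas, _non_text_areas = split_regions(prepared)       # S1300: text vs non-text areas
    texts = [read_text(prepared, area) for area in text_areas]  # S1400: text extraction
    record = {"screenshot_id": screenshot_id, "texts": texts}   # text mapped to the screenshot
    store(record)                                               # S1500: store locally
    send(record)                                                # S1600: transmit to the server
    return record


# Usage with stand-in components (all hypothetical).
stored: list[dict] = []
sent: list[dict] = []
result = run_terminal_pipeline(
    "shot-001",
    "raw-pixels",
    preprocess=lambda img: img,
    split_regions=lambda img: (["area-1", "area-2"], ["photo-1"]),
    read_text=lambda img, area: f"text from {area}",
    store=stored.append,
    send=sent.append,
)
```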

The step of acquiring a screenshot image (S1100) may include acquiring one screenshot image, or two or more screenshot images, from among the plurality of screenshot images that, as described above, were captured by the user's action and stored in the storage unit 300.

The step of acquiring a screenshot image (S1100) may also include acquiring a screenshot image from the outside. For example, the screenshot image acquired in step S1100 may be a screenshot image that was captured on an external device and shared by another person.

In the step of analyzing the screenshot image (S1200), the acquired screenshot image may be analyzed. This step may also include performing image preprocessing so that the acquired screenshot image is suitable for analysis. For example, through step S1200, image preprocessing may be performed so that the acquired screenshot image has a resolution, sharpness, brightness, saturation, and the like that are suitable for image analysis.
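As one concrete illustration of brightness preprocessing, the sketch below linearly stretches the brightness range of a grayscale image to the full 0 to 255 range. The specification does not name a particular preprocessing technique, so the choice of contrast stretching, the function name, and the nested-list image representation are all assumptions.

```python
def stretch_brightness(gray: list[list[int]]) -> list[list[int]]:
    """Linearly rescale grayscale pixel values so they span 0..255."""
    lo = min(min(row) for row in gray)
    hi = max(max(row) for row in gray)
    if hi == lo:
        # A flat image has no range to stretch; return a copy unchanged.
        return [row[:] for row in gray]
    return [[round((p - lo) * 255 / (hi - lo)) for p in row] for row in gray]


dim_image = [[50, 100], [150, 200]]
stretched = stretch_brightness(dim_image)
```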

Through the step of dividing the screenshot image into a plurality of areas (S1300), the screenshot image may be divided into a plurality of areas. This step may include classifying at least some areas of the acquired screenshot image as a text area or a non-text area.

FIG. 6 is a diagram for explaining how a screenshot image is divided into a text area or a non-text area through a neural network model. Referring to FIG. 6, the acquired screenshot image (SI) may be divided into a text area (TA) or a non-text area (NTA) through a first neural network model (NN1).

The text area (TA) may mean an area of the acquired screenshot image that contains text, and the text area (TA) may include a plurality of text areas, for example a first text area, a second text area, and so on.

In this case, the text area (TA) may include a first text area and a second text area that are distinguished in consideration of the size and/or font of the text included in the screenshot image.

As an example, the text area (TA) may include a first text area, a second text area, and so on that are distinguished based on text size. For example, the text area (TA) may include a first text area containing the text of a first size among the text in the screenshot image, and a second text area containing the text of a second size.

As another example, the text area (TA) may include a first text area, a second text area, and so on that are distinguished based on the font (typeface) of the text. For example, the text area (TA) may include a first text area containing the text in a first font among the text in the screenshot image, and a second text area containing the text in a second font.

As another example, the text area (TA) may include a first text area, a second text area, and so on that are provided based on both the size and the font of the text. For example, the text area (TA) may include a first text area containing the text that has a first size and a first font among the text in the screenshot image, and a second text area containing the text that has a second size and a second font.

In other words, the text area (TA) may include a first text area having a first attribute, a second text area having a second attribute, and so on up to an N-th text area having an N-th attribute. In this case, the first to N-th attributes may be determined based on at least one of the size and the font of the text.

As a result, the acquired screenshot image (SI) may be divided into a first text area having a first attribute, a second text area having a second attribute, and so on up to an N-th text area having an N-th attribute, together with a non-text area (NTA).

The acquired screenshot image (SI) may be divided into the first text area, the second text area through the N-th text area, and the non-text area through a pre-trained neural network model, but the division is not limited thereto.

The non-text area (NTA) may mean an area made up of the parts of the content in the screenshot image that are not text. For example, the non-text area (NTA) may mean an area containing a photograph of an object or a person, a logo, a trademark, a product photograph, or the like included in the screenshot image.

In this case, a plurality of non-text areas (NTA) may be formed, corresponding to the number of non-text parts of the content included in the screenshot image. That is, the non-text area (NTA) may include a first non-text area, a second non-text area, and so on.

Referring to FIG. 6, according to an embodiment, the controller 100 may divide the screenshot image (SI) into a text area (TA) or a non-text area (NTA) through the pre-trained first neural network model (NN1).

The first neural network model (NN1) may be trained to receive a screenshot image as input and obtain a text area (TA) and/or a non-text area (NTA) from the screenshot image.

The first neural network model (NN1) may be trained using training data that includes screenshot images and labeling data. In this case, the labeling data may include a first labeling value corresponding to the text area (TA), and may also include a second labeling value corresponding to the non-text area (NTA).

Specifically, the first neural network model (NN1) may receive a screenshot image as input and produce an output value. The first neural network model (NN1) may then be trained by updating the model based on an error value calculated from the difference between the output value and the labeling data. In this case, the output value may include a first output value corresponding to the first labeling value and a second output value corresponding to the second labeling value.
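The update rule described above (compute an output, compare it with the label, and update the model from the error) can be shown on a deliberately tiny stand-in. The sketch below trains a single logistic unit, not a neural network such as NN1, on hypothetical two-value patch features; the features, labels (1 for text, 0 for non-text), learning rate, and function names are all assumptions made for illustration.

```python
import math


def predict(weights: list[float], bias: float, features: list[float]) -> float:
    """Output value in (0, 1): estimated probability that a patch is text."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def train(samples: list[tuple[list[float], int]], epochs: int = 200, lr: float = 0.5):
    """Update the model from the error between output and label (logistic loss gradient)."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in samples:
            output = predict(weights, bias, features)
            error = output - label  # difference between output value and labeling value
            weights = [w - lr * error * x for w, x in zip(weights, features)]
            bias -= lr * error
    return weights, bias


# Hypothetical patch features, e.g. (edge density, stroke contrast); label 1 = text, 0 = non-text.
samples = [([0.9, 0.8], 1), ([0.8, 0.9], 1), ([0.1, 0.2], 0), ([0.2, 0.1], 0)]
weights, bias = train(samples)
text_score = predict(weights, bias, [0.9, 0.9])
non_text_score = predict(weights, bias, [0.1, 0.1])
```

After training, a text-like patch should score above 0.5 and a non-text-like patch below it; an actual segmentation network would produce such scores per pixel or per region rather than per hand-made feature vector.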

The controller 100 may obtain a plurality of text areas (TA) using the first neural network model (NN1). For example, the controller 100 may obtain a first text area and a second text area using the first neural network model (NN1).

For example, using the first neural network model (NN1), the controller 100 may obtain the area of the text having a first size (for example, a first text area) and the area of the text having a second size (for example, a second text area) from among the text included in the screenshot image.

In this case, the first text area may mean the area of the text having the first size among the text located within a predetermined range in the screenshot image, and the second text area may mean the area of the text having the second size among the text located within a predetermined range in the screenshot image. More specifically, the first text area may mean the area of those pieces of text that are adjacent to one another within a certain range, among the text of the first size in the screenshot image. Likewise, the second text area may mean the area of those pieces of text that are adjacent to one another within a certain range, among the text of the second size in the screenshot image.
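The idea that a text area gathers same-size text lying within a certain range of one another can be sketched as a simple grouping over bounding boxes. This is a rule-based illustration, not the neural network approach of NN1: the `TextBox` type, the use of box height as a proxy for text size, exact height equality, and the `max_gap` threshold are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextBox:
    x: int  # left edge in pixels
    y: int  # top edge in pixels
    w: int  # width
    h: int  # height, used here as a stand-in for text size


def is_adjacent(a: TextBox, b: TextBox, max_gap: int) -> bool:
    """True when the horizontal and vertical gaps between two boxes are both small."""
    gap_x = max(a.x - (b.x + b.w), b.x - (a.x + a.w), 0)
    gap_y = max(a.y - (b.y + b.h), b.y - (a.y + a.h), 0)
    return gap_x <= max_gap and gap_y <= max_gap


def group_text_areas(boxes: list[TextBox], max_gap: int = 10) -> list[list[TextBox]]:
    """Merge boxes of equal size that lie within max_gap of each other into one area."""
    groups: list[list[TextBox]] = []
    for box in boxes:
        merged = [box]
        remaining = []
        for group in groups:
            if any(box.h == other.h and is_adjacent(box, other, max_gap) for other in group):
                merged.extend(group)  # the new box links to this existing group
            else:
                remaining.append(group)
        groups = remaining + [merged]
    return groups


boxes = [
    TextBox(0, 0, 50, 20),    # same size and next to the second box
    TextBox(55, 0, 50, 20),
    TextBox(0, 25, 80, 12),   # nearby but a different size
    TextBox(0, 300, 50, 20),  # same size but far away
]
areas = group_text_areas(boxes)
```

With these sample boxes, the far-away box of the same size forms its own area, which mirrors the statement that several text areas of the first size may exist at different positions.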

As another example, using the first neural network model (NN1), the controller 100 may obtain the area of the text in a first font (for example, a first text area) and the area of the text in a second font (for example, a second text area) from among the text included in the screenshot image.

In this case, the first text area may mean the area of the text in the first font among the text located within a predetermined range in the screenshot image, and the second text area may mean the area of the text in the second font among the text located within a predetermined range in the screenshot image. More specifically, the first text area may mean the area of those pieces of text that are adjacent to one another within a certain range, among the text in the first font in the screenshot image. Likewise, the second text area may mean the area of those pieces of text that are adjacent to one another within a certain range, among the text in the second font in the screenshot image.

Meanwhile, among the text included in the screenshot image, there may be more than one text area having the first size or more than one text area having the first font. For example, the screenshot image may contain a plurality of text areas having the first size, and these text areas may be located at different positions within the screenshot image. Likewise, the screenshot image may contain a plurality of text areas having the first font, and these text areas may be located at different positions within the screenshot image.

Referring to FIG. 4, using the first neural network model (NN1), the controller 100 may obtain, from among the text included in the screenshot image (SI), a first text area (TA1) having a first size, a second text area (TA2) having a second size, and a third text area (TA3) having a third size. In this case, there may be a plurality of text areas having the first size (for example, the third text area (TA3), the fifth text area (TA5), and the sixth text area (TA6)), and a plurality of text areas having the second size (for example, the seventh text area (TA7) and the eighth text area (TA8)).

Likewise, using the first neural network model (NN1), the controller 100 may obtain, from among the text included in the screenshot image (SI), a first text area (TA1) having a first font, a fourth text area (TA4) having a second font, and a third text area (TA3) having a third font. In this case, there may be a plurality of text areas having the third font (for example, the third text area (TA3), the fifth text area (TA5), and the sixth text area (TA6)).

Meanwhile, a text area may mean an area of the acquired screenshot image that contains text. For example, a text area may mean a text-related image area in the screenshot image. In this specification, an area of a screenshot image that contains text is defined as a text area, but it may also be defined as a text-related image area.

After the above-described step of dividing the screenshot image into a plurality of areas (S1300) is performed, the step of extracting text from a text area among the plurality of areas (S1400) may be performed.

Through the step of extracting text (S1400), the text information contained in a text area may be extracted, and this step may be performed on the user terminal 1000. That is, the controller 100 of the user terminal 1000 may obtain a plurality of text areas from the screenshot image and extract text information from the plurality of text areas.

FIGS. 7 and 8 are diagrams for explaining a method of extracting text from a text area.

Referring to FIG. 7(a), the controller 100 may extract text information by analyzing the screenshot image. The controller 100 may extract text information from the screenshot image through a character reading module (TM). In other words, the controller 100 may use the entire screenshot image as input data of the character reading module (TM) and obtain text information as output data.

When the entire screenshot image is used as input data of the character reading module (TM), as in FIG. 7(a), the character recognition rate may be low, so the accuracy of the extracted text information may decrease. In addition, using the entire screenshot image as input requires the character reading module (TM) to process a large amount of data, so the data processing speed may decrease.

In contrast, referring to FIG. 7(b), the controller 100 may extract text information by analyzing only part of the screenshot image. The controller 100 may extract text information by analyzing, through the character reading module (TM), at least some of the plurality of text areas (TA) included in the screenshot image. In other words, the controller 100 may use at least some of the plurality of text areas (TA) included in the screenshot image as input data of the character reading module (TM) and obtain text information as output data.

When at least some of the plurality of text areas included in the screenshot image are used as input data of the character reading module (TM), as in FIG. 7(b), the character recognition rate may be high, so the accuracy of the extracted text information may improve. In addition, in this case the data processing speed of the character reading module (TM) may improve.

The character reading module (TM) may include a character recognition algorithm (program or function), and may use it to obtain a character image from image data and recognize characters by analyzing the obtained character image pixel by pixel. The character reading module (TM) may also include an optical character reading algorithm (program or function), and may use it to obtain result data from reading an image of scanned characters. In addition, the character reading algorithm of the character reading module (TM) may be a lightweight algorithm.

Referring to FIG. 8, the controller 100 may extract a plurality of pieces of text information, one from each of the plurality of text areas, through the character reading module (TM).

The controller 100 may extract first text information from the first text area (TA1) through the character reading module (TM). The controller 100 may extract second text information from the second text area (TA2) through the character reading module (TM). That is, each of the plurality of text areas may be input to the character reading module (TM) individually, and the character reading module (TM) may extract the respective text information from each text area.
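Feeding each text area to the character reading module individually can be sketched as cropping followed by a per-crop call. The sketch assumes an image stored as rows of pixel values and boxes given as (left, top, right, bottom); `read_characters` is a hypothetical callable standing in for the character reading module (TM), and the fake reader used below only reports the crop size so that the example is self-contained.

```python
from typing import Callable, Sequence

Box = tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive
Image = Sequence[Sequence[int]]  # rows of grayscale pixel values


def crop(image: Image, box: Box) -> list[list[int]]:
    """Cut the pixels of one text area out of the screenshot."""
    left, top, right, bottom = box
    return [list(row[left:right]) for row in image[top:bottom]]


def extract_text_per_area(
    image: Image,
    text_areas: Sequence[Box],
    read_characters: Callable[[list[list[int]]], str],
) -> list[str]:
    """Pass each text area separately to the reader and collect one result per area."""
    return [read_characters(crop(image, box)) for box in text_areas]


# A 4-row by 6-column dummy image and two text areas.
image = [[0] * 6 for _ in range(4)]
text_areas = [(0, 0, 3, 2), (2, 1, 6, 4)]
# Stand-in reader: reports the crop size instead of recognizing characters.
fake_reader = lambda patch: f"{len(patch[0])}x{len(patch)}"
texts = extract_text_per_area(image, text_areas, fake_reader)
```

Because the reader only ever sees the cropped pixels, the amount of data it processes is bounded by the total size of the text areas rather than by the size of the whole screenshot.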

For example, the controller 100 may extract text information, through the character reading module TM, from the area of text having a first size among the text included in the screenshot image. As another example, the controller 100 may output text information, through the character reading module TM, from the area of text having a first font among the text included in the screenshot image.

FIGS. 9 to 11 are diagrams illustrating examples of text information extracted from text areas. Referring to FIGS. 9 to 11, text information may be extracted from a screenshot image captured by the user.

For example, referring to FIG. 9, the user may capture an SNS screen displayed through the output unit 500 of the user terminal 1000 as shown in FIG. 9(a), and the controller 100 may obtain the captured image and extract text information as shown in FIG. 9(b).

More specifically, as shown in FIG. 9(a), the controller 100 may obtain a plurality of text areas (e.g., a first text area TA1, a second text area TA2, and a third text area TA3) from the obtained screenshot image. The controller 100 may then use the character reading module TM to extract text information as shown in FIG. 9(b) from at least one of the obtained text areas.

As another example, referring to FIG. 10, the user may capture a messenger screen displayed through the output unit 500 of the user terminal 1000 as shown in FIG. 10(a), and the controller 100 may obtain the captured image and extract text information as shown in FIG. 10(b).

More specifically, as shown in FIG. 10(a), the controller 100 may obtain a plurality of text areas (e.g., a first text area TA1 to a sixth text area TA6) from the obtained screenshot image. The controller 100 may then use the character reading module TM to extract text information as shown in FIG. 10(b) from at least one of the obtained text areas.

As another example, referring to FIG. 11, the user may capture a news screen displayed through the output unit 500 of the user terminal 1000 as shown in FIG. 11(a), and the controller 100 may obtain the captured image and extract text information as shown in FIG. 11(b).

More specifically, as shown in FIG. 11(a), the controller 100 may obtain a plurality of text areas (e.g., a first text area TA1 to a ninth text area TA9) from the obtained screenshot image. The controller 100 may then use the character reading module TM to extract text information as shown in FIG. 11(b) from at least one of the obtained text areas.

FIG. 12 is a diagram illustrating a method for extracting image information IN from a non-text area NTA. Referring to FIG. 12, the controller 100 may extract image information IN, through the image reading module IM, from a plurality of non-text areas NTA obtained from the screenshot image.

Although not shown in FIG. 5, after the step of dividing the screenshot image into a plurality of areas (S1300) is performed, a step of extracting object information from a non-text area among the plurality of areas may be performed.

The step of extracting object information from the non-text area may be performed through a pre-trained neural network model. The pre-trained neural network model may be a model trained to obtain object information based on an image.

The object information may be mapped to the screenshot image and stored, and the user may classify and/or search for the screenshot image based on the object information mapped to it. The user may also search for a desired screenshot image using the object information and/or keywords mapped to and stored with the screenshot image. For example, the user may search for a desired screenshot image by entering, as a search term, content relating to the object information mapped to that screenshot image.

Meanwhile, the step of extracting object information from the non-text area may be performed in the user terminal 1000, but is not limited thereto. The step of extracting object information from the non-text area may instead be performed in the server S.

For example, the controller 100 of the user terminal 1000 may transmit the non-text area NTA obtained from the screenshot image to the server S, and the server S may obtain object information TI from the received non-text area NTA through the image reading module IM or a neural network model.

As another example, the controller 100 of the user terminal 1000 may transmit the screenshot image to the server S, and the server S may obtain object information TI from the received screenshot image through the image reading module IM or a neural network model.

Through the step of mapping the extracted text information to the screenshot image and storing it (S1500), the controller 100 may map the extracted text information to the screenshot image and then store it in the storage unit 300.

After extracting text information from the screenshot image, the controller 100 may map the extracted text information to the screenshot image. Here, mapping the extracted text information to the screenshot image may mean, for example, attaching an identification mark so that the controller 100 or the server S can recognize that the text information was extracted from that screenshot image.
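One way to picture the mapping described above is as a record keyed by an identifier of the screenshot. The following is a minimal sketch under that assumption; the names `ScreenshotRecord`, `MappingStore`, and the string identifier `screenshot_id` are hypothetical and do not appear in the description.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ScreenshotRecord:
    screenshot_id: str  # hypothetical identification mark for one screenshot
    text_info: List[str] = field(default_factory=list)


class MappingStore:
    """In-memory stand-in for a storage unit that keeps the mapping."""

    def __init__(self) -> None:
        self._records: Dict[str, ScreenshotRecord] = {}

    def map_text(self, screenshot_id: str, text: str) -> None:
        """Attach extracted text to the screenshot it was extracted from."""
        record = self._records.setdefault(
            screenshot_id, ScreenshotRecord(screenshot_id)
        )
        record.text_info.append(text)

    def source_of(self, text: str) -> List[str]:
        """Return the ids of the screenshots a piece of text is mapped to."""
        return [sid for sid, rec in self._records.items() if text in rec.text_info]
```

With such a record, either the terminal or the server can later tell which screenshot a given piece of text information came from.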

Through the step of transmitting the extracted text to the server (S1600), the controller 100 may transmit the extracted text to the server. The controller 100 may transmit the text information to the server so that keywords can be extracted from it.

2.3 Overall process performed on the server

FIG. 13 is a diagram illustrating the overall process by which the server extracts keywords from text information. Referring to FIG. 13, the method by which the server extracts keywords from text information may include receiving text information from the user terminal (S2100), analyzing the text information to obtain keywords (S2200), mapping the obtained keywords to the screenshot image and storing them (S2300), and transmitting the obtained keywords to the user terminal (S2400).

Through the step of receiving text information from the user terminal (S2100), the server S may receive text information from the user terminal 1000. Here, as described above, the text information may be information about text extracted from the screenshot image by the controller 100 of the user terminal 1000.

Through the step of analyzing the text information to obtain keywords (S2200), the server S may analyze the text information obtained from the user terminal 1000 and obtain keywords based on that analysis.

FIG. 14 is a diagram illustrating a method of obtaining keywords from text information through a second neural network model according to an embodiment.

Referring to FIG. 14, the server S may obtain keywords from text information through the second neural network model NN2. The second neural network model NN2 may be trained to obtain keywords based on text information.

Although not shown in the drawings, the server S may obtain important keywords from the text information using a keyword extraction algorithm. The keyword extraction algorithm may include any of various known algorithms defined to extract key words or sentences from a plurality of words or from a sentence structure.

The keyword may mean a key word, number, sentence, or the like among the various text (e.g., words, numbers, sentences) included in the text information.

The keyword may mean at least one key word, number, sentence, or the like that can represent the screenshot image, among the various text (e.g., words, numbers, sentences) included in the text information.

A plurality of keywords may be obtained from the text information. That is, since there may be more than one word or sentence that can represent the screenshot image, more than one keyword may be obtained from the text information.

When the server S obtains a plurality of keywords from the text information, it may obtain keywords in which importance is reflected. That is, each of the keywords obtained by the server S may carry its own importance, and the importance values of the keywords may be the same or different.

For example, the server S may obtain two keywords from the text information, and both keywords may carry the same importance.

As another example, the server S may obtain two keywords from the text information, and one of them may carry a higher importance than the other. In this case, the keyword with the relatively higher importance may be more likely to be related to the screenshot image. Alternatively, the keyword with the relatively higher importance may be a word, sentence, or the like that is more representative of the screenshot image.
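The idea of keywords that carry an importance value can be sketched with a simple term-frequency score. This is a swapped-in, deliberately simple technique chosen for illustration; it is not the second neural network model NN2 or any algorithm named in the description, and the stopword list and the function name `keywords_with_importance` are assumptions.

```python
from collections import Counter
from typing import List, Tuple

# Illustrative stopword list only; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "at"}


def keywords_with_importance(text_info: str, top_k: int = 3) -> List[Tuple[str, float]]:
    """Return up to top_k (keyword, importance) pairs, most important first.

    Importance here is the keyword's share of all non-stopword tokens.
    """
    words = [w.strip(".,:;!?").lower() for w in text_info.split()]
    words = [w for w in words if w and w not in STOPWORDS]
    counts = Counter(words)
    total = sum(counts.values())
    if total == 0:
        return []
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(word, count / total) for word, count in ranked[:top_k]]


ranked_keywords = keywords_with_importance(
    "Coffee coupon: free coffee at the cafe", top_k=2
)
```

In this toy input, "coffee" occurs twice and therefore receives a higher importance than any other keyword; keywords with equal counts receive equal importance, matching the two cases described above.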

The keyword may be any one of the various text items included in the text information, but is not limited thereto. The keyword may be a word, number, sentence, or the like that is not included in the text information. In other words, the keyword may be a new word, number, sentence, or combination thereof derived from the various text included in the text information.

Through the step of mapping the obtained keyword to the screenshot image and storing it (S2300), the server S may map the obtained keyword to the screenshot image and store the screenshot image to which the keyword is mapped. Here, mapping the obtained keyword to the screenshot image may mean, for example, attaching an identification mark so that the controller 100 or the server S can recognize that the keyword represents that screenshot image.

Through the step of transmitting the obtained keyword to the user terminal (S2400), the server S may transmit the obtained keyword to the user terminal 1000. The server S may transmit the obtained keyword to the user terminal 1000 so that screenshot images can be searched and classified on the user terminal 1000 based on that keyword.
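The server-side sequence S2100 to S2400 can be summarized as a single handler. The sketch below is only an outline under stated assumptions: `handle_text_info`, the dictionary used as a keyword store, and the `send_to_terminal` callback are hypothetical names, and the keyword extractor is passed in rather than fixed.

```python
from typing import Callable, Dict, List


def handle_text_info(
    screenshot_id: str,
    text_info: str,
    extract_keywords: Callable[[str], List[str]],
    keyword_store: Dict[str, List[str]],
    send_to_terminal: Callable[[str, List[str]], None],
) -> List[str]:
    """Run the four server-side steps for one piece of received text information."""
    # S2100: text_info has been received from the user terminal (the arguments).
    keywords = extract_keywords(text_info)     # S2200: analyze text, obtain keywords
    keyword_store[screenshot_id] = keywords    # S2300: map keywords to the screenshot
    send_to_terminal(screenshot_id, keywords)  # S2400: send keywords to the terminal
    return keywords
```

Passing the extractor and the transport as parameters keeps the outline independent of whether keywords come from a neural network model or from a conventional extraction algorithm.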

FIG. 15 is a diagram illustrating examples of keywords and target information extracted from a screenshot image.

Referring to FIG. 15, a screenshot image SI captured and stored through a user action may contain various kinds of text or non-text content. In this case, at least one keyword and/or piece of object information may be obtained from the screenshot image through the user terminal 1000 and/or the server S.

For example, referring to FIG. 15, a first keyword may be obtained from the first text area TA1 included in the screenshot image SI, a second keyword from the second text area TA2, a third keyword from the third text area TA3, a fourth keyword from the fourth text area TA4, a fifth keyword from the fifth text area TA5, a sixth keyword from the sixth text area TA6, a seventh keyword from the seventh text area TA7, and an eighth keyword from the eighth text area TA8.

In this case, as described above, importance may be reflected in the obtained keywords (e.g., the first to eighth keywords). For example, when the first keyword is the one that best represents the screenshot image SI, the first keyword may carry the highest importance.

Also referring to FIG. 15, first object information may be obtained from the first non-text area NTA1, and second object information may be obtained from the second non-text area NTA2.

The at least one keyword and/or piece of object information obtained from the screenshot image SI may be defined as tag information, and the tag information may be mapped to the screenshot image SI and stored in the user terminal 1000. The user can search for a desired screenshot image using the tag information. A method of searching for screenshot images stored in the user terminal 1000 using the obtained tag information is described later.
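As a minimal sketch of how tag information might be built and used for lookup, the following combines keywords and object information into one tag set per screenshot and searches by exact tag match. The function names, the lower-casing of tags, and the exact-match rule are assumptions made for illustration; the description does not specify a matching rule.

```python
from typing import Dict, Iterable, List, Set


def build_tag_info(keywords: Iterable[str], object_info: Iterable[str]) -> Set[str]:
    """Combine keywords and object information into one normalized tag set."""
    return {t.lower() for t in keywords} | {t.lower() for t in object_info}


def search_screenshots(tag_index: Dict[str, Set[str]], query: str) -> List[str]:
    """Return ids of screenshots whose tag set contains the query (exact match)."""
    q = query.strip().lower()
    return sorted(sid for sid, tags in tag_index.items() if q in tags)


# Hypothetical index: screenshot id -> tag information mapped to that screenshot.
tag_index = {
    "s1": build_tag_info(["Gifticon", "coffee"], ["cup"]),
    "s2": build_tag_info(["news"], []),
}
matches = search_screenshots(tag_index, "gifticon")
```

A real search would likely also support partial or fuzzy matching, which is left out here to keep the sketch short.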

3 Method of selecting and providing recommended keywords

The controller 100 may obtain any one of the plurality of screenshot images stored in the user terminal 1000 based on at least one piece of tag information obtained from the screenshot image SI.

FIG. 16 is a diagram illustrating an example of a screen displayed through the output unit of the user terminal. Referring to FIG. 16, the output unit 500 of the user terminal 1000 may display a search unit SE that can receive user input, a recommendation tag display unit REC, and a search result display unit RES.

Based on the user input obtained through the search unit SE, the controller 100 may select at least one of the plurality of screenshot images stored in the storage unit 500 and provide it to the user.

The controller 100 may display at least one piece of tag information obtained from the screenshot image SI through the recommendation tag display unit REC of the output unit 500 and provide it to the user. Specifically, the controller 100 may display tag information that meets a predetermined criterion through the output unit 500 and recommend it to the user. In this case, the tag information recommended to the user may be a recommendation tag.

According to an embodiment, the user terminal 1000 may extract first text information from a first screenshot image through the character reading module and transmit the first text information to the server, and the server may extract first tag information based on the first text information and transmit it to the user terminal 1000. Likewise, the user terminal 1000 may extract second text information from a second screenshot image through the character reading module and transmit the second text information to the server, and the server may extract second tag information based on the second text information and transmit it to the user terminal 1000.

In this case, the user terminal 1000 may determine at least one of the first tag information and the second tag information as a recommendation tag according to a predetermined criterion and provide it to the user.

The user terminal 1000 may receive the first tag information from the server at a first time point and the second tag information at a second time point. In this case, the user terminal 1000 may determine the first tag information as the recommendation tag, where the first time point is earlier than the second time point.

The user terminal 1000 may receive, from the server, a plurality of pieces of tag information extracted from a plurality of screenshot images. In this case, the user terminal 1000 may determine the recommendation tag based on how frequently each piece of tag information is received from the server. For example, the user terminal 1000 may determine the tag information received most frequently from the server as the recommendation tag and provide it to the user.
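The frequency-based criterion can be sketched as a simple count over the tags received so far. The function name `most_frequent_tag` and the representation of received tag information as a list of strings are assumptions for illustration only.

```python
from collections import Counter
from typing import Iterable, Optional


def most_frequent_tag(received_tags: Iterable[str]) -> Optional[str]:
    """Return the tag received most often, or None if nothing was received."""
    counts = Counter(received_tags)
    if not counts:
        return None
    # most_common(1) returns a one-element list of (tag, count) pairs.
    return counts.most_common(1)[0][0]


# Hypothetical sequence of tag information received from the server.
recommended = most_frequent_tag(["SNS", "gifticon", "SNS", "news"])
```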

In short, the server may obtain keywords from the text information received from the user terminal and then extract tag information based on the obtained keywords. The keyword may be at least one word, number, sentence, or combination thereof that represents the screenshot image among the text included in the text information. Here, the user terminal 1000 may extract tag information from the keyword and use it to display a recommendation tag through the recommendation tag display unit REC.

According to an embodiment, as shown in FIG. 16(a), a recommendation tag consisting of text may be displayed on the recommendation tag display unit REC. That is, the recommendation tag extracted by the method described above may be provided in text form.

According to another embodiment, as shown in FIG. 16(b), a recommendation tag consisting of text and a representative image may be displayed on the recommendation tag display unit REC. Although not shown in the drawings, a recommendation tag consisting only of a representative image may also be displayed on the recommendation tag display unit REC. The representative image displayed on the recommendation tag display unit REC may be an image determined based on the keyword. For example, the representative image may be determined based on the keyword and may be an image that clearly conveys the type or content of the screenshot image.

FIG. 17 is a diagram illustrating a method of selecting a recommendation tag and providing it to the user. Referring to FIG. 17, the method may include obtaining a screenshot image (S3100), analyzing the screenshot image (S3200), extracting text information from the screenshot image (S3300), obtaining a plurality of keywords from the text information (S3400), selecting a recommendation tag from among the plurality of keywords (S3500), and providing the recommendation tag to the user (S3600).

Through the step of obtaining a screenshot image (S3100), the controller 100 may obtain a screenshot image captured and stored by the user. Through this step, the controller 100 may obtain at least one, or two or more, of the plurality of screenshot images captured through user actions and stored in the storage unit 300. The controller 100 may also obtain a screenshot image from outside the user terminal 1000; this was described above for the step of obtaining a screenshot image (S1100) of FIG. 5, so a redundant description is omitted.

Through the step of analyzing the screenshot image (S3200), the controller 100 may analyze the obtained screenshot image. The controller 100 may pre-process the obtained screenshot image so that it is suitable for analysis. The step of analyzing the screenshot image (S3200) is the same as or corresponds to the step of analyzing the screenshot image (S1200) of FIG. 5, so a redundant description is omitted.

Through the step of extracting text information from the screenshot image (S3300), the controller 100 may analyze the obtained screenshot image and extract text information. More specifically, the controller 100 may divide the screenshot image into text areas and/or non-text areas and extract text information from the text areas through the character reading module. This was described above for the step of dividing the screenshot image into a plurality of areas (S1300) and the step of extracting text from a text area among the plurality of areas (S1400) of FIG. 5, so a redundant description is omitted.

The controller 100 may transmit the obtained text information to the server S. The server S may obtain a plurality of keywords from the received text information. The server S may obtain the keywords from the text information through a pre-trained neural network model or a known algorithm; this was described above through the series of steps of FIG. 13, so a redundant description is omitted.

Through the step of selecting a recommendation tag from among the plurality of keywords (S3500), the controller 100 may select, from the obtained keywords, a recommendation tag to be provided to the user.

The recommendation tag may be selected from the tag information stored in the storage unit 400 based on a predetermined criterion.

The controller 100 may sort the tag information stored in the storage unit 400 in time order and provide it to the user as recommendation tags. For example, the controller 100 may provide the tag information stored in the storage unit 400 to the user in order of most recently obtained and stored. That is, the recommendation tag that the controller 100 provides to the user may be tag information extracted from the most recently captured and stored screenshot image.
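The time-ordered criterion can be sketched as a sort on a stored timestamp. The `(saved_at, tag)` tuple layout, the `limit` parameter, and the function name `tags_by_recency` are assumptions made for this illustration; the description does not state how timestamps are stored.

```python
from typing import List, Tuple


def tags_by_recency(stored: List[Tuple[float, str]], limit: int = 3) -> List[str]:
    """Return distinct tags ordered from most recently stored to oldest.

    stored holds hypothetical (saved_at timestamp, tag) pairs.
    """
    seen = set()
    ordered = []
    for _, tag in sorted(stored, key=lambda item: item[0], reverse=True):
        if tag not in seen:  # keep only the newest occurrence of each tag
            seen.add(tag)
            ordered.append(tag)
    return ordered[:limit]


recent_tags = tags_by_recency([(1.0, "news"), (3.0, "SNS"), (2.0, "news")])
```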

제어부(100)는 저장부(400)에 저장되어 있는 태그 정보를 빈도 수에 따라 분류하여 사용자에게 추천 태그로 제공할 수 있다. 보다 구체적으로, 저장부(400)에는 복수의 태그 정보가 저장되어 있으므로, 이 경우 반복하여 저장되어 있는 태그 정보가 있을 수 있다. 이때, 제어부(100)는 저장부(400)에 저장되어 있는 태그 정보 중 가장 많이 저장된 태그 순서로 사용자에게 제공할 수 있다. 즉, 제어부(100)가 사용자에게 제공하는 추천 태그는 가장 많이 반복하여 저장된 태그일 수 있다. The controller 100 may classify the tag information stored in the storage 400 according to the frequency and provide it to the user as a recommended tag. More specifically, since a plurality of tag information is stored in the storage unit 400 , in this case, there may be repeatedly stored tag information. In this case, the control unit 100 may provide the user with the order of the most stored tags among the tag information stored in the storage unit 400 . That is, the recommendation tag provided to the user by the controller 100 may be a tag stored repeatedly the most.

제어부(100)는 저장부(400)에 저장되어 있는 태그 정보를 사용자의 활동 로그에 따라 분류하여 사용자에게 추천 태그로 제공할 수 있다. 제어부(100)는 사용자 단말기(1000)에서의 사용자의 활동 로그에 대한 정보를 획득한 후 이에 기초하여 사용자에게 적합한 태그 정보를 제공할 수 있다.The controller 100 may classify the tag information stored in the storage 400 according to the user's activity log and provide it to the user as a recommended tag. After obtaining information about the user's activity log in the user terminal 1000 , the controller 100 may provide suitable tag information to the user based thereon.

제어부(100)는 저장부(400)에 저장되어 있는 태그 정보를 유형 별로 분류하여 사용자에게 추천 태그로 제공할 수 있다. 보다 구체적으로, 저장부(400)에 저장되어 있는 복수의 스크린샷 이미지는 태그 정보에 기초하여 유형 별로 분류될 수 있다. 예컨대, 저장부(400)에 저장되어 있는 복수의 스크린샷 이미지 중 '웹페이지'라는 태그 정보가 맵핑된 스크린샷 이미지는 제1 유형으로 분류될 수 있고, '기프티콘'이라는 태그 정보가 맵핑된 스크린샷 이미지는 제2 유형으로 분류될 수 있고, '쇼핑몰'이라는 태그 정보가 맵핑된 스크린샷 이미지는 제3 유형으로 분류될 수 있고, 'SNS'라는 태그 정보가 맵핑된 스크린샷 이미지는 제4 유형으로 분류될 수 있다.The control unit 100 may classify the tag information stored in the storage unit 400 by type and provide it to the user as a recommended tag. More specifically, the plurality of screenshot images stored in the storage 400 may be classified by type based on tag information. For example, a screenshot image to which tag information 'webpage' is mapped among a plurality of screenshot images stored in the storage unit 400 may be classified as a first type, and a screen to which tag information 'gifticon' is mapped. The shot image may be classified into a second type, a screenshot image to which tag information of 'shopping mall' is mapped may be classified as a third type, and a screenshot image to which tag information of 'SNS' is mapped may be classified as a fourth type can be classified as

In this way, screenshot images may be classified into a plurality of types based on the tag information mapped to them, and the control unit 100 may provide the user, as recommended tags, with the tag information that served as the basis for classifying the screenshot images into the plurality of types. That is, the recommended tags that the control unit 100 provides to the user may be 'webpage', 'gifticon', 'shopping mall', and the like.
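The type-based classification above can be sketched as follows. This is only an illustration under stated assumptions: the `TYPE_TAGS` list mirrors the four example tags in the text, while the dictionary layout of `screenshots` and both function names are hypothetical.

```python
# Tags that define a type, taken from the examples in the text.
TYPE_TAGS = ["webpage", "gifticon", "shopping mall", "SNS"]


def classify_by_type(screenshots, type_tags=TYPE_TAGS):
    """Group screenshot ids under each type-defining tag mapped to them.

    screenshots maps a screenshot id to the set of tags mapped to it.
    """
    groups = {tag: [] for tag in type_tags}
    for shot_id, tags in screenshots.items():
        for tag in type_tags:
            if tag in tags:
                groups[tag].append(shot_id)
    return groups


def recommended_type_tags(groups):
    """Recommend only the type tags that actually classify some screenshot."""
    return [tag for tag, ids in groups.items() if ids]


screenshots = {
    "s1": {"webpage", "news"},
    "s2": {"gifticon"},
    "s3": {"webpage"},
}
groups = classify_by_type(screenshots)
print(groups["webpage"])              # ['s1', 's3']
print(recommended_type_tags(groups))  # ['webpage', 'gifticon']
```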

The control unit 100 may display the recommended tag selected through the step of providing a recommended tag to the user (S3600) on the output unit 500, thereby providing it to the user.

The control unit 100 may provide the recommended tag to the user through the output unit 500 in various ways. For example, the control unit 100 may provide only the text of the recommended tag to the user through the output unit 500. As another example, the control unit 100 may provide, through the output unit 500, the text of the recommended tag together with at least a partial area of a screenshot image to which the recommended tag is mapped. When the recommended tag is provided to the user as an image together with text, the user can grasp the information about that recommended tag more intuitively.

In this case, the image provided to the user together with the recommended tag may relate to at least a partial area of the screenshot image. For example, the image may correspond to a text area and/or a non-text area of the screenshot image. Alternatively, the image provided to the user together with the recommended tag may be an image processed based on the recommended tag.

4 Method of searching for and providing screenshot images based on tag information

The user terminal 1000 may include a storage unit 300 that stores a plurality of screenshot images, an output unit 500 that displays at least one of the plurality of screenshot images, an input unit 400 that receives a user input, a communication unit 700 that communicates with an external server, and a control unit 100 that determines, based on the user input, at least one screenshot image among the plurality of screenshot images to be displayed on the output unit.

Hereinafter, a method of searching for at least one of the plurality of screenshot images stored in the storage unit 300 based on tag information will be described with reference to the drawings.

FIG. 18 is a diagram for explaining a method of searching for screenshot images based on tag information and a method of providing the search results to the user.

Referring to FIG. 18, the output unit 500 may include a search unit (SE) in which the user can enter a search term to search for screenshot images, a related search unit (SE') that displays keywords related to the user input entered in the search unit, and a search result unit (RES) that displays the screenshot images found in response to the user input.

A plurality of screenshot images may be stored in the storage unit 300, and at least one piece of tag information may be mapped to each of the plurality of screenshot images. The tag information is extracted from the screenshot image; because this has been described above, a redundant description is omitted.

When the user input is entered through the input unit 400, the control unit 100 may determine, among the tag information stored in the storage unit 300, matching tag information corresponding to the user input.

For example, the control unit 100 may determine, as the matching tag information, tag information stored in the storage unit 300 that is identical to the user input or whose similarity to the user input is at or above a predetermined threshold.
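A minimal sketch of this matching step follows. The specification does not say how similarity is measured, so the character-level ratio from Python's standard `difflib.SequenceMatcher` is used here purely as a stand-in; the `threshold` value of 0.8 and the function name `find_matching_tags` are likewise assumptions.

```python
from difflib import SequenceMatcher


def find_matching_tags(query, stored_tags, threshold=0.8):
    """Return (tag, score) pairs for tags equal to, or similar enough to, the query.

    An identical tag scores 1.0; otherwise the score is a character-level
    similarity ratio in [0, 1]. The similarity measure is an assumption.
    """
    matches = []
    for tag in stored_tags:
        if tag == query:
            score = 1.0
        else:
            score = SequenceMatcher(None, query, tag).ratio()
        if score >= threshold:
            matches.append((tag, score))
    return matches


stored = ["gifticon", "gifticons", "webpage"]
for tag, score in find_matching_tags("gifticon", stored):
    print(tag, round(score, 2))
```

With these sample tags, the exact match and the near-identical 'gifticons' pass the threshold, while 'webpage' does not.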

According to one embodiment, as shown in (a) of FIG. 18, a single user input may be entered through the search unit (SE). That is, one user input is obtained through the user input unit 400, and the control unit 100 may determine the matching tag information based on that user input and then output, to the search result unit (RES), at least one screenshot image to which the matching tag information is mapped.

According to another embodiment, as shown in (b) of FIG. 18, two or more user inputs may be obtained through the user input unit 400. The user inputs obtained through the user input unit 400 may include a first user input and a second user input. In this case, the first user input is entered at a first time point and the second user input is entered at a second time point, and the first time point and the second time point may differ.

The control unit 100 may determine the matching tag information based on at least one of the first user input and the second user input. For example, the control unit 100 may determine, as the matching tag information, tag information corresponding to at least one of the first user input and the second user input.

The first time point may be earlier than the second time point. In this case, the control unit 100 may determine the matching tag information in consideration of the time points at which the first user input and the second user input were entered into the input unit 400.

For example, the control unit 100 may determine the matching tag information based on both the first user input and the second user input while giving greater weight to the first user input. More specifically, among the tag information corresponding to the first user input and the second user input, the control unit 100 may determine, as the matching tag information, the tag information that has the higher probability of corresponding to the first user input.
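One way to read "giving greater weight to the first user input" is a weighted score, sketched below. The weights 0.7 and 0.3, the use of `difflib.SequenceMatcher` as the correspondence measure, and the function names are all assumptions for illustration; the specification only states that the first input is given more importance.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Character-level similarity in [0, 1]; a stand-in correspondence measure."""
    return SequenceMatcher(None, a, b).ratio()


def weighted_match(tags, first_input, second_input, w_first=0.7, w_second=0.3):
    """Pick the tag with the highest weighted score, favoring the first input.

    w_first > w_second encodes the greater importance of the earlier input.
    Returns None when there are no candidate tags.
    """
    if not tags:
        return None

    def score(tag):
        return (w_first * similarity(first_input, tag)
                + w_second * similarity(second_input, tag))

    return max(tags, key=score)


# Each tag matches one input exactly; the first input decides the outcome.
print(weighted_match(["coffee", "gifticon"], "coffee", "gifticon"))  # coffee
```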

As another example, the control unit 100 may determine, as the matching tag information, the tag information that corresponds to the second user input from among the tag information that corresponds to the first user input. More specifically, the control unit 100 may first determine, among the plurality of pieces of tag information, first tag information corresponding to the first user input. The control unit 100 may then determine, as the matching tag information, second tag information that corresponds to the second user input from among the first tag information.
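The two-stage narrowing described above can be sketched as two successive filters. What it means for a tag to "correspond" to an input is not defined in the text; the case-insensitive substring test in the hypothetical `corresponds` helper is an assumption chosen to keep the example simple.

```python
def corresponds(user_input, tag):
    """Assumed correspondence test: the input appears inside the tag."""
    return user_input.lower() in tag.lower()


def narrow_matching_tags(tags, first_input, second_input):
    """Filter by the first input, then filter that result by the second input."""
    first_tags = [t for t in tags if corresponds(first_input, t)]
    return [t for t in first_tags if corresponds(second_input, t)]


tags = ["shopping mall", "shopping cart", "mall map", "gifticon"]
print(narrow_matching_tags(tags, "shopping", "mall"))  # ['shopping mall']
```

Note that 'mall map' is excluded even though it corresponds to the second input, because it was already filtered out by the first input.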

The control unit 100 may control the screenshot images to which the matching tag information is mapped to be output through the output unit, and may control them to be displayed on the output unit in the order of the time at which they were stored in the storage unit.

For example, the screenshot images stored in the storage unit 300 may carry information about the time at which they were captured. In this case, the control unit 100 may sort the plurality of screenshot images to which the matching tag information is mapped by the time at which they were stored in the storage unit 300 and then control them to be output to the output unit 500.

More specifically, the control unit 100 may sort the plurality of screenshot images to which the matching tag information is mapped so that the images most recently stored in the storage unit 300 come first, and then control them to be output to the output unit 500.
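As a minimal sketch of this ordering, the matched screenshots can be sorted on a stored timestamp in descending order. The record layout (a dictionary with `id` and `saved_at` keys) and the function name are hypothetical.

```python
from datetime import datetime


def sort_most_recent_first(screenshots):
    """Return the matched screenshots ordered from newest to oldest."""
    return sorted(screenshots, key=lambda s: s["saved_at"], reverse=True)


matched = [
    {"id": "a", "saved_at": datetime(2021, 1, 1)},
    {"id": "b", "saved_at": datetime(2021, 6, 1)},
    {"id": "c", "saved_at": datetime(2021, 3, 1)},
]
print([s["id"] for s in sort_most_recent_first(matched)])  # ['b', 'c', 'a']
```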

In addition, the control unit 100 may control the screenshot images to which the matching tag information is mapped to be output through the output unit, and may control them to be displayed on the output unit in descending order of the probability that the matching tag information corresponds to the user input.

For example, the matching tag information is determined based on the user input; more specifically, among the plurality of pieces of tag information, the tag information corresponding to the user input is determined as the matching tag information.

In this case, the matching tag information may be determined based on the probability that each piece of tag information corresponds to the user input. For example, among the plurality of pieces of tag information, tag information whose probability of corresponding to the user input is at or above a predetermined threshold may be determined as the matching tag information.
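The probability-based selection and ordering can be sketched as a threshold filter followed by a descending sort. How the probabilities are computed is outside this sketch; the input dictionaries, the `threshold` of 0.5, and the function name `rank_screenshots` are assumptions for illustration.

```python
def rank_screenshots(tag_probabilities, tag_to_screenshots, threshold=0.5):
    """Keep tags at or above the threshold and list their screenshots by probability.

    tag_probabilities maps each tag to the probability that it corresponds
    to the user input; tag_to_screenshots maps each tag to the screenshots
    it is mapped to. A screenshot reachable through several tags is listed
    once, at the position of its highest-probability tag.
    """
    matching = {t: p for t, p in tag_probabilities.items() if p >= threshold}
    ranked_tags = sorted(matching, key=matching.get, reverse=True)
    results = []
    for tag in ranked_tags:
        for shot in tag_to_screenshots.get(tag, []):
            if shot not in results:
                results.append(shot)
    return results


probabilities = {"gifticon": 0.9, "gift": 0.6, "webpage": 0.1}
mapping = {"gifticon": ["a.png"], "gift": ["b.png", "a.png"], "webpage": ["c.png"]}
print(rank_screenshots(probabilities, mapping))  # ['a.png', 'b.png']
```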

The features, structures, effects, and the like described in the embodiments above are included in at least one embodiment of the present invention and are not necessarily limited to a single embodiment. Furthermore, the features, structures, effects, and the like illustrated in each embodiment may be combined or modified for other embodiments by a person of ordinary skill in the art to which the embodiments belong. Accordingly, matters relating to such combinations and modifications should be construed as falling within the scope of the present invention.

In addition, although the description above has focused on the embodiments, they are merely examples and do not limit the present invention. A person of ordinary skill in the art to which the present invention pertains will appreciate that various modifications and applications not illustrated above are possible without departing from the essential characteristics of the embodiments. That is, each component specifically shown in the embodiments may be modified and implemented. Differences relating to such modifications and applications should be construed as falling within the scope of the present invention as defined in the appended claims.

Claims

A system for extracting tag information from a screenshot image, comprising:
a user terminal in which at least one screenshot image is stored; and
a server that transmits and receives information about the screenshot image to and from the user terminal,
wherein the screenshot image includes at least one text,
wherein the user terminal obtains, within the screenshot image, a text area in which the text exists and a non-text area in which the text does not exist, extracts text information from the text area through a character reading module, maps the text information to an identification number assigned to the screenshot image, and transmits the text information to the server, and
wherein the server receives the text information from the user terminal, extracts a keyword from the text information, maps tag information derived based on the keyword to the identification number, and transmits the tag information to the user terminal.

The system of claim 1,
wherein the screenshot image includes at least one text area,
wherein the text area includes a first text area having a first attribute and second to N-th text areas having second to N-th attributes, respectively, and
wherein the first to N-th attributes are determined by at least one of a text size and a font.

The system of claim 2,
wherein the text area includes a first text area, which is an area in which text having a first size within an error range exists among the text in the screenshot image, and a second text area, which is an area in which text having a second size within an error range exists.

The system of claim 3,
wherein the first text area is an area of text that is adjacent within a certain range among the text having the first size in the screenshot image, and
wherein the second text area is an area of text that is adjacent within a certain range among the text having the second size in the screenshot image.

The system of claim 2,
wherein the text area includes a first text area, which is an area in which text having a first font exists among the text in the screenshot image, and a second text area, which is an area in which text having a second font exists.

The system of claim 5,
wherein the first text area is an area of text that is adjacent within a certain range among the text having the first font in the screenshot image, and
wherein the second text area is an area of text that is adjacent within a certain range among the text having the second font in the screenshot image.

The system of claim 3 or 5,
wherein the text information includes first text information and second text information,
wherein the user terminal extracts the first text information from the first text area through the character reading module and extracts the second text information from the second text area through the character reading module, and
wherein the first text area and the second text area are input to the character reading module independently of each other.

The system of claim 2,
wherein the user terminal obtains the text area from the screenshot image using a pre-trained neural network model, and
wherein the pre-trained neural network model is trained to divide the text area into the first text area and the second to N-th text areas.

The system of claim 1,
wherein a plurality of keywords may be extracted from the text information, and
wherein each keyword is a word, a number, a sentence, or a combination thereof that represents the screenshot image among the text included in the text information.

The system of claim 1,
wherein the server extracts a plurality of keywords from the text information, and an importance is reflected in each of the plurality of keywords, and
wherein the importance is determined based on the probability that a word, a number, a sentence, or a combination thereof included in the text information represents the screenshot image.

The system of claim 1,
wherein the server extracts the keyword from the text information through a pre-trained neural network model, and the pre-trained neural network model is trained to obtain the keyword based on the text information.

The system of claim 1,
wherein the user terminal extracts object information from the non-text area, maps the object information to the screenshot image, and transmits the object information to the server, and
wherein the server derives the tag information based on the object information and the keyword, maps the derived tag information to the screenshot image, and transmits the tag information to the user terminal.