KR102642894B1

KR102642894B1 - Method, apparatus and computer programs for identifying illustration images

Info

Publication number: KR102642894B1
Application number: KR1020230058022A
Authority: KR
Inventors: 박천명; 진혁준; 석주원
Original assignee: 주식회사 키위스튜디오
Priority date: 2023-05-03
Filing date: 2023-05-03
Publication date: 2024-03-04

Abstract

A picture image identification method according to an embodiment of the present invention includes extracting an analysis target area from an image and, when a texture classification model determines that the analysis target area contains an object having the texture of an art tool and a background object, identifying the image as a picture image. The texture classification model is a machine learning model trained on training images consisting of a plurality of picture image sets, each containing at least one of a point, line, or surface drawn with the texture of an art tool, and regions of background objects.
According to the present invention, only the images that contain picture data can be extracted from a plurality of images, classified, and put to use.

Description

Methods, devices and computer programs for identifying pictorial images {METHOD, APPARATUS AND COMPUTER PROGRAMS FOR IDENTIFYING ILLUSTRATION IMAGES}

The present invention relates to a method, device, and computer program for identifying a picture image from an image. More specifically, the present invention relates to a method, device, and computer program that identify a picture image from an image using a machine learning model.

Classifying specific objects in an image is one of the most important techniques in computer vision. Convolutional neural networks (CNNs), among the most representative deep learning algorithms, are used to extract features from images, and image segmentation is used to determine which object a group of pixels within an image belongs to.

Various methods for identifying the objects in an image have thus been proposed, and different approaches to improving identification performance have been developed depending on the type of object.

For example, Korean Patent Application Publication No. 2017-0047500 detects a document area in an image in order to digitize printed documents. It does so by detecting one or more edges in the image and determining vertices based on the intersections of grouped lines.

Techniques of this kind, which detect a document area distinct from the background, have been applied to applications such as document recognition and business card recognition. Techniques for identifying images that contain pictures, rather than document images that contain text, have been comparatively underdeveloped.

Considering recent attempts to make use of picture data, such as the psychological analysis based on works of art in Korean Patent No. 10-2415106, there is a need to select picture data from large image databases by automatically determining whether an image is a picture image before objects are extracted from it. As noted above, however, few techniques for identifying picture images that contain picture data have been disclosed, so such a technique needs to be developed.

One problem to be solved by the present invention is to provide a method, device, and computer program capable of identifying a picture image that contains picture data.

One object of the present invention is to classify, from among a plurality of images, the picture images that contain picture data, so that the picture data can be put to use.

The problems to be solved by the present invention are not limited to those described above, and problems not mentioned will be clearly understood from this specification and the accompanying drawings by those of ordinary skill in the art to which the present invention pertains.

A picture image identification method according to an embodiment of the present invention includes extracting an analysis target area from an image and, when a texture classification model determines that the analysis target area contains an object having the texture of an art tool, identifying the image as a picture image. The texture classification model is a machine learning model trained on a plurality of picture image sets, each containing at least one of a point, line, or surface drawn with the texture of an art tool.

A picture image identification device according to an embodiment of the present invention includes an area extraction unit that extracts an analysis target area from an image, and a determination unit that identifies the image as a picture image when a texture classification model determines that the analysis target area contains the texture of an art tool. The texture classification model is a machine learning model trained on a plurality of picture image sets, each containing at least one of a point, line, or surface drawn with the texture of an art tool.

The means for solving the problems of the present invention are not limited to those described above, and means not mentioned will be clearly understood from this specification and the accompanying drawings by those of ordinary skill in the art to which the present invention pertains.

According to the method, device, and computer program of an embodiment of the present invention, only the images that contain picture data can be extracted from a plurality of images, classified, and put to use.

Furthermore, according to the present invention, the entire process of using picture data can be automated. That is, images whose picture data needs to be used and analyzed can be selected automatically from among images already stored in a database, so the step in which a user selects and inputs the image to be analyzed within an application can be omitted. This in turn can give the user a better user experience.

The effects of the present invention are not limited to those described above, and effects not mentioned will be clearly understood from this specification and the accompanying drawings by those of ordinary skill in the art to which the present invention pertains.

FIG. 1 is a block diagram of a picture image identification device according to an embodiment of the present invention.
FIG. 2 is a block diagram of a picture image identification device according to another embodiment of the present invention.
FIG. 3 is a schematic diagram illustrating an aspect of detecting a paper area using machine learning models according to an embodiment of the present invention.
FIG. 4 is a schematic diagram illustrating an aspect of obtaining the texture information contained in an analysis target area using a texture classification model according to an embodiment of the present invention.
FIG. 5 is a flowchart illustrating a picture image identification method according to an embodiment of the present invention.
FIG. 6 is a flowchart for explaining a subject area extraction method according to an embodiment of the present invention.
FIG. 7 is a flowchart for explaining a paper area extraction method according to another embodiment of the present invention.
FIG. 8 is a flowchart for explaining a picture image identification method according to an embodiment of the present invention.
FIG. 9 is a flowchart for explaining a picture image identification method using RGB depth values according to an embodiment of the present invention.

The above objects, features, and advantages of the present invention will become more apparent from the following detailed description taken together with the accompanying drawings. Because the present invention can be modified in various ways and can have several embodiments, specific embodiments are illustrated in the drawings and described in detail below.

Throughout the specification, the same reference numerals refer in principle to the same components. Components that have the same function within the scope of the same idea shown in the drawings of each embodiment are described using the same reference numerals, and duplicate descriptions are omitted.

Where a detailed description of a known function or configuration related to the present invention would unnecessarily obscure the gist of the invention, that description is omitted. Ordinals used in this specification (for example, first and second) are merely identifiers that distinguish one component from another.

The suffixes "unit" and "module" attached to components in the following embodiments are assigned or used interchangeably solely for ease of drafting the specification, and do not in themselves carry distinct meanings or roles.

In the following embodiments, singular expressions include plural expressions unless the context clearly indicates otherwise.

In the following embodiments, terms such as "include" or "have" mean that the features or components described in the specification are present, and do not exclude in advance the possibility that one or more other features or components may be added.

In the drawings, the sizes of components may be exaggerated or reduced for convenience of explanation. For example, the size and thickness of each component in the drawings are shown arbitrarily for convenience of explanation, and the present invention is not necessarily limited to what is illustrated.

Where an embodiment can be implemented differently, a particular sequence of processes may be performed in an order different from the one described. For example, two processes described in succession may be performed substantially at the same time, or in the reverse of the order described.

In the following embodiments, when components are said to be connected, this includes not only the case where they are connected directly but also the case where they are connected indirectly with other components interposed between them.

For example, when components are said in this specification to be electrically connected, this includes not only the case where they are directly electrically connected but also the case where they are indirectly electrically connected with other components interposed between them.

In this specification, a "picture image" means an image that contains picture data. "Picture data" may include two-dimensional or three-dimensional data composed of points, lines, or surfaces created using art tools. Picture data may also include picture data created using an application on an electronic device.

Ordinarily, a "picture" means a painting in which a form is rendered on a flat surface using tools that produce color. In this specification, however, "picture" may be understood to also encompass works of art such as sculpture, crafts, architecture, design, prints, and drawings created with various materials such as colored paper, wood, and plastic. That is, "picture data" in this specification may be interpreted as visual data corresponding to a work of art, and may be presented as a two-dimensional or three-dimensional image.

The picture image identification device, method, and computer program of the present invention are described below with reference to FIGS. 1 to 7.

FIG. 1 is a block diagram of a picture image identification device 100 according to an embodiment of the present application.

The picture image identification device 100 according to an embodiment of the present invention may be a processor of an electronic device that includes a preprocessing unit 110 (which may be omitted), an area extraction unit 130, and a determination unit 150. The picture image identification device 100 may be a processor operating in a server or a processor operating in a terminal. When it is a processor operating in a server, the server may obtain an image from a terminal through a communication module and pass it to the picture image identification device 100, which identifies whether the image is a picture image. When it is a processor operating in a terminal, the terminal may pass an image stored in the terminal, or an image acquired with a camera module included in the terminal, to the picture image identification device 100, which identifies whether the image is a picture image.

The preprocessing unit 110 is an optional part of the picture image identification device 100 when it identifies a picture image.

The preprocessing unit 110 may apply the image to a text extraction model to extract one or more character areas that contain text. The preprocessing unit 110 may remove the areas recognized as characters from the image and may additionally remove noise.

In another embodiment, the preprocessing unit 110 may generate the image from which the character areas and noise have been removed as a preprocessed image and pass it to the area extraction unit 130. When the preprocessing unit 110 has done this work, the area extraction unit 130 extracts the analysis target area from the preprocessed image rather than from the original image.
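
The text removal step above can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes the text extraction model has already returned character areas as `(left, top, right, bottom)` boxes, it works on a grayscale image stored as nested lists, and it paints each box with the median intensity as a stand-in for the background. The function name `remove_text_regions` and the fill strategy are assumptions introduced here.

```python
from statistics import median


def remove_text_regions(image, text_boxes):
    """Return a copy of a grayscale image with each text box painted over.

    image: list of rows, each a list of 0-255 integers.
    text_boxes: (left, top, right, bottom) boxes with exclusive right/bottom,
                assumed to come from some text extraction model.
    """
    height, width = len(image), len(image[0])
    # Assumption: the median intensity approximates the background colour.
    fill = int(median(value for row in image for value in row))
    cleaned = [row[:] for row in image]  # leave the input untouched
    for left, top, right, bottom in text_boxes:
        # Clamp each box to the image bounds before painting.
        for y in range(max(0, top), min(height, bottom)):
            for x in range(max(0, left), min(width, right)):
                cleaned[y][x] = fill
    return cleaned
```

The returned copy plays the role of the preprocessed image that is handed to the area extraction unit 130.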

The area extraction unit 130 may extract an analysis target area from the image. The image may be a photograph or an image generated on an electronic device. It may be an image that was stored in a database, or one acquired in real time from an imaging device (or module) such as a camera.

The analysis target area is the region of interest to be analyzed. It may be a subject area or an area judged to be paper.

The area extraction unit 130 may extract an analysis target area containing the subject area or the paper area and pass it to the determination unit 150.

The following describes in more detail one embodiment in which the region of interest analyzed by the area extraction unit 130 is a subject area, and one in which it is a paper area.

When the analysis target area is a subject area, the area extraction unit 130 removes the background from the image and detects the subject area. According to one embodiment, the area extraction unit 130 may remove the background by applying the image to a subject/background area classification model and cropping the subject area. The subject/background area classification model may be a machine learning model trained on a large dataset in which area characteristics are labeled pixel by pixel. For example, it may be a deep learning model trained on a large dataset in which every pixel is labeled as subject or background. Datasets such as DUTS-TE, DUT-OMRON, HKU-IS, and ECSSD may be used, as may separately collected image data.

A U-Net family deep learning model may be used as the subject/background area classification model, although the model is not limited to this.
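
As an illustration of the cropping step, the sketch below assumes that the subject/background model has already produced a per-pixel mask (1 for subject, 0 for background) and crops the image to the bounding box of the subject pixels. The function name `crop_to_subject` and the nested-list image representation are hypothetical; the segmentation model itself is outside the scope of the sketch.

```python
def crop_to_subject(image, mask):
    """Crop an image to the bounding box of the pixels labelled as subject.

    image, mask: nested lists of the same height and width;
                 mask holds 1 for subject pixels and 0 for background.
    Returns None when the mask contains no subject pixel.
    """
    rows = [y for y, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [x for x in range(len(mask[0])) if any(row[x] for row in mask)]
    top, bottom = rows[0], rows[-1] + 1
    left, right = cols[0], cols[-1] + 1
    return [row[left:right] for row in image[top:bottom]]
```

Cropping to a bounding box keeps some background pixels inside the box; an implementation could also zero out the pixels that the mask marks as background.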

When a subject area is detected in the image, the area extraction unit 130 may resize the subject area, set it as the analysis target area, and apply the resized subject area to the texture classification model. In one embodiment, the area extraction unit 130 may resize the subject area so that its width and height are equal. Resizing is performed when the texture classification model expects a preset input image size.
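
A minimal sketch of resizing a cropped area to equal width and height is shown below. It uses nearest-neighbour sampling, which is an assumption made here for brevity; the patent does not specify the interpolation method, and `resize_to_square` is a hypothetical name.

```python
def resize_to_square(region, side):
    """Nearest-neighbour resize of a 2D region to side x side pixels."""
    src_height, src_width = len(region), len(region[0])
    return [
        # Map each target pixel back to the source pixel it samples from.
        [region[y * src_height // side][x * src_width // side] for x in range(side)]
        for y in range(side)
    ]
```

For example, `resize_to_square(region, 224)` would produce the fixed-size input that a classifier with a 224 x 224 input layer expects; the value 224 is only an example.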

When the analysis target area is a paper area, the area extraction unit 130 may detect the paper area in the image, crop it, and set it as the analysis target area. Here, a paper area means an area recognized as paper, such as a sketchbook, drawing paper, or A4 paper.

According to one embodiment, the area extraction unit 130 may detect the paper area using a paper area detection model, an edge detector, and a sketchbook area detection model. The image may be applied to each model separately and the paper area cropped according to the detection result of a single model, or, as shown in FIG. 3, the paper area detection results obtained by applying the image to a plurality of models may be cross-verified. Cross-verification raises the accuracy of paper area detection and, as a result, can also raise the accuracy of picture image identification.
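
The patent does not say how the detection results are cross-verified. One plausible reading, sketched below purely as an assumption, is to accept a paper box only when at least two detectors produce boxes that overlap strongly, measured by intersection over union (IoU). The names `iou` and `cross_verify` and the default thresholds are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two (left, top, right, bottom) boxes."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0


def cross_verify(boxes, min_iou=0.5, min_votes=2):
    """Return the first box that enough detectors agree on, else None.

    boxes: one candidate paper box per detector.
    A box counts as a vote for the candidate when their IoU reaches min_iou;
    the candidate always votes for itself.
    """
    for candidate in boxes:
        votes = sum(1 for other in boxes if iou(candidate, other) >= min_iou)
        if votes >= min_votes:
            return candidate
    return None
```

With three detectors, `min_votes=2` corresponds to a simple majority; requiring all three to agree would trade recall for precision.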

In the embodiment above, the paper area detection model may be a machine learning model such as YOLO trained on a plurality of paper images. It is trained on a training dataset labeled with bounding boxes and, given an image, outputs the bounding box coordinates corresponding to the paper area. The edge detector may detect edges using the histogram difference between adjacent pixels. It detects borders using an image processing library such as a Canny edge detector: it computes histogram values for the pixels and defines an edge wherever the histogram value differs greatly from that of an adjacent pixel. The sketchbook area detection model may be a model trained on training data in which sketchbook areas are labeled pixel by pixel.

Another embodiment, which detects the paper area using the edge detector, is as follows. The area extraction unit 130 may detect the edges of all objects in the image using the edge detector. It may then analyze the shapes of the edges and extract every edge (contour) that forms a rectangle. Among these rectangular edges, the area corresponding to the largest rectangle may be set as the paper area. The area set as the paper area may be identified as the picture area.
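
The two ideas in the edge-based embodiment can be sketched in simplified form. The first function marks a pixel as an edge when its intensity differs strongly from its right or lower neighbour; this raw intensity difference is a simplification introduced here and is neither the histogram computation the text mentions nor the full Canny algorithm. The second function picks the largest of a set of candidate rectangles, which are assumed to have been extracted from the edge map by a separate contour analysis step that is not shown. Both function names and the threshold are hypothetical.

```python
def edge_map(image, threshold=50):
    """Mark pixels whose intensity differs strongly from the right or lower neighbour."""
    height, width = len(image), len(image[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            right = abs(image[y][x] - image[y][x + 1]) if x + 1 < width else 0
            down = abs(image[y][x] - image[y + 1][x]) if y + 1 < height else 0
            if max(right, down) > threshold:
                edges[y][x] = 1
    return edges


def largest_rectangle(rectangles):
    """Pick the (left, top, right, bottom) rectangle with the largest area, or None."""
    return max(
        rectangles,
        key=lambda r: (r[2] - r[0]) * (r[3] - r[1]),
        default=None,
    )
```

In practice an image processing library would normally handle both edge detection and contour extraction; the sketch only makes the selection rule "largest rectangular contour becomes the paper area" concrete.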

Once the area extraction unit 130 has detected and cropped the paper area, it passes the area to the determination unit 150 so that the determination unit 150 can determine whether the paper area contains picture data.

The determination unit 150 may use the texture classification model to determine whether the analysis target area contains an object having the texture of an art tool. If it determines that such an object is present, the determination unit 150 may classify the image as a picture image. The art tool may be at least one of crayons, paints, colored paper, pencils, ballpoint pens, felt-tip pens, pastels, house paint, clay, stamps, or markers, and paints may include any coloring material regardless of type, such as acrylic, watercolor, or oil paints.

The determination unit 150 uses a texture classification model to determine whether an object having the texture of an art tool (picture data) is present. The texture classification model is a machine learning model trained on a plurality of picture image sets, each containing at least one of a point, line, or surface drawn with the texture of an art tool, and may be a CNN-based multi-class classification model. In addition to picture image sets containing objects drawn with the texture of art tools, the picture image sets used to train the model may further include training images consisting of a region of an object having at least one of the textures of plastic, skin, vinyl flooring, wooden flooring, or wood. Such objects frequently appear in picture images and may be included in the training images to raise classification accuracy. That is, the texture classification model may be trained on images of a region cropped from, for example, a photograph of an object with a particular texture, and it can classify the texture of one or more objects in an input image. The determination unit 150 may use together a texture classification model trained on picture image sets containing only objects with art tool textures and a texture classification model trained on picture image sets containing both objects with art tool textures and background objects (skin, vinyl flooring, wooden flooring, wood, and so on), or it may use either model on its own.

If the classification result of the texture classification model shows that at least one art tool (crayon, paint, colored paper, pencil, ballpoint pen, felt-tip pen, and so on) is present with high probability, the determination unit 150 may determine that the analysis target area contains picture data, and may determine that the image containing that area is a picture image, that is, an image containing picture data. As shown in FIG. 4, the texture classification model takes an analysis target area as input and obtains the texture information it contains, outputting the probability that an object with the texture of each art tool is present. The determination unit 150 uses this texture information to determine whether picture data is present and, if so, determines that the image containing the analysis target area is a picture image.
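
The decision rule can be sketched as a threshold on the per-class probabilities. The class labels, the `ART_TOOL_CLASSES` set, and the 0.5 default threshold are assumptions made for illustration; the text only says that an art tool must be present "with high probability" and gives no numeric value.

```python
# Hypothetical label names for the art-tool texture classes.
ART_TOOL_CLASSES = {
    "crayon", "paint", "colored_paper", "pencil", "ballpoint_pen", "felt_tip_pen",
}


def is_picture_image(texture_probs, threshold=0.5):
    """Return True if any art-tool texture class reaches the threshold.

    texture_probs: mapping from texture label to the probability output by
                   the texture classification model for one analysis area.
    """
    return any(
        prob >= threshold
        for label, prob in texture_probs.items()
        if label in ART_TOOL_CLASSES
    )
```

Background classes such as wood or skin are ignored by the rule, so an area dominated by a background texture is not treated as a picture image even when its top probability is high.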

A picture image identification device 200 according to another embodiment of the present invention is described below with reference to FIG. 2. In the embodiment shown in FIG. 2, the picture image identification device 200 may be a processor of an electronic device that includes a preprocessing unit 210 (which may be omitted), a pixel analysis unit 230, and a determination unit 250. The picture image identification device 200 may be a processor operating in a server or a processor operating in a terminal. When it is a processor operating in a server, the server may obtain an image from a terminal through a communication module and pass it to the picture image identification device 200, which identifies whether the image is a picture image. When it is a processor operating in a terminal, the terminal may pass an image stored in the terminal, or an image acquired with a camera module included in the terminal, to the picture image identification device 200, which identifies whether the image is a picture image.

The preprocessing unit 210 may perform the same operations as the preprocessing unit 110 described with reference to FIG. 1 and, like it, may be omitted. The description of the preprocessing unit 110 therefore applies to the preprocessing unit 210 as well.

The pixel analysis unit 230 may extract, from the image, characteristic information of the pixels constituting the image. The characteristic information may be either the RGB value of a pixel or the RGB depth value (RGB-DEPTH) of a pixel, and the pixel analysis unit 230 may pass either one to the determination unit 250. A method of extracting the RGB depth value of a pixel according to an embodiment is as follows. First, the pixel analysis unit 230 applies the image to a monocular depth estimation model trained on a dataset in which depth is expressed in RGB, such as the RGB-d Object Dataset, so that the depth of the image is expressed as RGB values. The RGB-d Object Dataset is a dataset from which a model can learn, from RGB values, how deep an object lies in a 2D image, and a pix2pix-based or U-Net-based model may be used as the monocular depth estimation model. The pixel analysis unit 230 then obtains the RGB depth value of each pixel by extracting the depth value from the RGB value of each pixel in the RGB-encoded image.

The determination unit 250 may identify the image as a drawing image using the pixel characteristic information (the RGB values or the RGB depth values of the pixels).

As one example, when the characteristic information extracted by the pixel analysis unit 230 is RGB values, the determination unit 250 may detect pixel groups in which the difference between the RGB values of adjacent pixels is greater than or equal to a preset threshold. The difference between RGB values may be computed as a Euclidean distance. When the number of pixel groups whose RGB difference is at or above the threshold is at least a preset count, the image may be identified as a drawing image.

For example, assume that the pixels at positions (2,3), (3,4), and (4,5) have the RGB value [220,255,80], that the pixels at positions (2,4), (3,5), and (4,6) have the RGB value [255,255,255], and that the threshold used to judge the RGB difference between adjacent pixels is 150. The determination unit 250 may compute the RGB difference between the adjacent pixels, that is, between {(2,3) and (2,4)}, {(3,4) and (3,5)}, and {(4,5) and (4,6)}. In each case the difference is about 178.5 (the square root of 35² + 0² + 175² = 31,850), so all three are judged to be pixel groups whose difference is at or above the threshold. A pixel group with an RGB difference at or above the threshold indicates an unnatural gradation; when the image contains a point, line, or surface object whose gradation is not natural, the determination unit 250 may determine that the image is a drawing image.
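The worked example above can be reproduced with a short sketch. It assumes that a "pixel group" means a pair of horizontally or vertically adjacent pixels and that the preset minimum count is 3; the description fixes neither of these, and the function names are illustrative only.

```python
import math


def rgb_distance(p, q):
    """Euclidean distance between two RGB triples (Python 3.8+)."""
    return math.dist(p, q)


def count_sharp_pairs(image, threshold):
    """Count adjacent pixel pairs whose RGB distance is >= threshold.

    image is a list of rows, each row a list of (r, g, b) tuples.
    Each pixel is compared with its right and lower neighbour only,
    so every adjacent pair is counted exactly once.
    """
    count = 0
    for y, row in enumerate(image):
        for x, pixel in enumerate(row):
            if x + 1 < len(row) and rgb_distance(pixel, row[x + 1]) >= threshold:
                count += 1
            if y + 1 < len(image) and rgb_distance(pixel, image[y + 1][x]) >= threshold:
                count += 1
    return count


def is_drawing_by_contrast(image, threshold=150, min_pairs=3):
    """Assumed decision rule: enough sharp transitions means a drawing image."""
    return count_sharp_pairs(image, threshold) >= min_pairs
```

With three rows that each contain [220,255,80] next to [255,255,255], the three horizontal pairs exceed the threshold of 150 while the vertical pairs have a distance of 0, so exactly three pixel groups are counted.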

As another embodiment for the case where the characteristic information extracted by the pixel analysis unit 230 is RGB values, the determination unit 250 may cluster the extracted RGB values and identify the image as a drawing image when the number of resulting clusters is less than a preset number. Clustering the per-pixel RGB values with a method such as k-means clustering groups colors of a similar range together. Because a drawing is generally created using the colors that the art tools provide, a drawing image is likely to contain fewer color clusters than a photographic image. A reference number for identifying an image as a drawing image can be set on this basis, and the determination unit 250 may determine that an image with fewer color clusters than the reference number is a drawing image. For example, with a reference number of 8, the determination unit 250 classifies the image as a drawing image if the colors extracted from it cluster into 5 groups, and as a photographic image if they cluster into 10 groups.
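The cluster-count test can be sketched as follows. Because k-means needs the number of clusters to be fixed in advance, this sketch substitutes a simple greedy "leader" clustering so that the cluster count emerges from the data; this is not the k-means method named in the description, and the radius of 60 is an arbitrary assumption.

```python
import math


def count_color_clusters(pixels, radius=60.0):
    """Greedy leader clustering over RGB triples.

    A color joins the first cluster whose leader lies within `radius`
    (Euclidean distance); otherwise it starts a new cluster.
    Returns the number of clusters found.
    """
    leaders = []
    for color in pixels:
        if not any(math.dist(color, leader) <= radius for leader in leaders):
            leaders.append(color)
    return len(leaders)


def is_drawing_by_color_count(pixels, reference_count=8, radius=60.0):
    """Drawing image if the cluster count is below the reference number."""
    return count_color_clusters(pixels, radius) < reference_count
```

With a reference number of 8, an image whose colors fall into 5 clusters would be classified as a drawing image and one with 10 clusters as a photographic image, matching the example in the text.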

The reference number may be set differently according to user characteristics. If the drawing image identification device 100 analyzes user A's drawing images over a certain period and finds that user A's drawings use about 6 color groups on average, the determination unit 250 may set the reference number to 7 or 8. If the drawing images uploaded by user B use about 15 color groups on average, the drawing image identification device 100 may set the reference number to 15 or 16 when identifying drawing images among the images in user B's database.
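One way to derive a per-user reference number is sketched below. The rule used here (the rounded-up average plus a margin of 1) is an assumption chosen so that the results fall inside the ranges given in the two examples; the description does not specify a formula.

```python
import math


def reference_count_for_user(cluster_counts, margin=1):
    """Derive a reference number from a user's past drawing images.

    cluster_counts holds the number of color clusters found in each of
    the user's stored drawing images. The margin is an assumed value.
    """
    if not cluster_counts:
        raise ValueError("at least one stored drawing image is required")
    mean = sum(cluster_counts) / len(cluster_counts)
    return math.ceil(mean) + margin
```

For a user averaging 6 color groups this gives 7, and for a user averaging 15 it gives 16.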

When the characteristic information extracted by the pixel analysis unit 230 is RGB depth values, the determination unit 250 may generate a cropped image by cropping an area of the image whose RGB depth values fall within a preset range. Pixels whose depth values lie within a similar range are located on the same line, which means they are likely to represent a flat object. The determination unit 250 can therefore detect a paper area by cropping an area that is likely to be flat paper or colored paper. The detected paper area (the cropped image) may be resized so that its width and height are equal and then applied to the texture classification model. In this embodiment the cropped image thus serves as the analysis target area (region of interest) shown in FIG. 3, and the subsequent process is similar to that in which the determination unit 150 of FIG. 1 identifies a drawing image using the texture classification model. The characteristics of the texture classification model were described with reference to FIG. 1 and are not repeated here.
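The depth-based cropping can be sketched as taking the bounding box of every pixel whose depth value lies in the preset range. Treating the crop as a single axis-aligned bounding box and representing the depth map as one scalar per pixel are both assumptions made for this illustration.

```python
def crop_by_depth(image, depth, lo, hi):
    """Crop the bounding box of all pixels with lo <= depth <= hi.

    image and depth are lists of rows with identical dimensions.
    Returns the cropped image, or None if no pixel is in range.
    """
    coords = [
        (y, x)
        for y, row in enumerate(depth)
        for x, value in enumerate(row)
        if lo <= value <= hi
    ]
    if not coords:
        return None
    top = min(y for y, _ in coords)
    bottom = max(y for y, _ in coords)
    left = min(x for _, x in coords)
    right = max(x for _, x in coords)
    return [row[left:right + 1] for row in image[top:bottom + 1]]
```

The returned crop would then be resized to a square and passed to the texture classification model, as the paragraph above describes.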

Hereinafter, a drawing image identification method according to an embodiment of the present invention is described with reference to FIGS. 5 to 7.

FIG. 5 is a flowchart illustrating a drawing image identification method according to an embodiment of the present invention. Referring to FIG. 5, the method may include an image acquisition step (S50), an analysis target area extraction step (S100), a step of determining whether the analysis target area includes an object with the texture of an art tool (S200), and a step of identifying the image as a drawing image (S300).

In the step of acquiring and preprocessing an image (S50), the electronic device may load an image stored in a database or acquire an image in real time from an imaging device (or module) such as a camera. If the electronic device is a server, the server may acquire the image by receiving it from a terminal over a network. In step 50 the electronic device may optionally preprocess the image; in other words, step 50 may be understood as omittable. When the image is preprocessed in step 50, the electronic device may apply the image to a text extraction model to extract one or more character areas containing text. The electronic device may remove the areas recognized as text from the image and may additionally remove noise. When step 50 is included, the electronic device extracts the analysis target area in step 100 not from the original image but from the image from which the text areas and noise have been removed.
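The removal of text areas during preprocessing can be sketched as blanking out the boxes reported by a text extraction model. The box format (x1, y1, x2, y2, with exclusive upper bounds) and the choice to fill with white are assumptions; the description does not say how the removed areas are filled.

```python
def remove_text_regions(image, text_boxes, fill=(255, 255, 255)):
    """Return a copy of the image with each text box overwritten by `fill`.

    image is a list of rows of (r, g, b) tuples; text_boxes is a list of
    (x1, y1, x2, y2) tuples, assumed to come from a text extraction model.
    Boxes are clipped to the image bounds.
    """
    cleaned = [list(row) for row in image]
    for x1, y1, x2, y2 in text_boxes:
        for y in range(max(0, y1), min(len(cleaned), y2)):
            for x in range(max(0, x1), min(len(cleaned[y]), x2)):
                cleaned[y][x] = fill
    return cleaned
```

The original image is left untouched, so the cleaned copy can be passed to step 100 while the original remains available.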

In the step of extracting the analysis target area from the image (S100), the electronic device may extract the analysis target area from a photographic image or from an image generated by the electronic device.

The analysis target area is the region of interest to be analyzed; it may be a subject area or an area judged to be paper. The analysis target area is extracted in order to increase the accuracy and speed of drawing image identification. When the acquired image is a photographic image, the drawing data is likely to lie within the paper area. Therefore, according to an embodiment of the present invention, removing the background area, which can act as noise, allows the texture of objects in the paper area to be identified faster and more accurately.

Hereinafter, a method by which the electronic device extracts a subject area or a paper area as the analysis target area is described in detail with reference to FIGS. 6 and 7.

FIG. 6 is a flowchart illustrating a subject area extraction method according to an embodiment of the present invention. Referring to FIG. 6, in the analysis target area extraction step (S100), the electronic device may remove the background from the image and detect the subject area (S110). The electronic device may then resize the subject area and set it as the analysis target area (S130).

In step 110, the electronic device may remove the background by applying the image to a subject/background area classification model and cropping the subject area. The subject/background area classification model may be a machine learning model trained on a large dataset in which area characteristics are labeled pixel by pixel. For example, it may be a deep learning model trained on a large dataset in which each pixel is labeled as subject or background. DUTS-TE, DUT-OMRON, HKU-IS, ECSSD, and the like may be used as the large dataset, and individually collected image data may also be used.

In step 130, the electronic device may resize the subject area so that its width and height are equal, set the resized subject area as the analysis target area, and apply it to the texture classification model.
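The square resize in step 130 can be sketched with nearest-neighbour sampling. The interpolation method and the target side length are assumptions; the description requires only that the width and height end up equal.

```python
def resize_to_square(image, side):
    """Resize a non-empty image to side x side by nearest-neighbour sampling.

    image is a list of equal-length rows. Each output pixel copies the
    source pixel at the proportionally scaled position.
    """
    src_h = len(image)
    src_w = len(image[0])
    return [
        [image[y * src_h // side][x * src_w // side] for x in range(side)]
        for y in range(side)
    ]
```

A 2 x 4 input resized with side=2 keeps both rows and every second column, so the aspect ratio is not preserved, which is consistent with forcing equal width and height.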

FIG. 7 is a flowchart illustrating an analysis target area extraction method according to another embodiment of the present invention. Referring to FIG. 7, in step 100 the electronic device may detect a paper area in the image (S130), crop the detected paper area, and set it as the analysis target area (S150). Here, the paper area means an area recognized as paper, such as a sketchbook, drawing paper, or A4 paper.

According to an embodiment, in step 130 the electronic device may detect the paper area using a paper area detection model, an edge detector, and a sketchbook area detection model. The image may be applied to each of these individually and the paper area cropped according to the detection result of a single model, or, as shown in FIG. 3, the paper area detection results obtained by applying the image to a plurality of models may be cross-validated. Cross-validation raises the accuracy of paper area detection and, as a result, can raise the accuracy of drawing image identification.
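One possible form of the cross-validation is sketched below: a detected box is accepted only if at least one other detector reports a box that overlaps it sufficiently. Using intersection over union (IoU) with a minimum of 0.5 is an assumption; the description does not state how the results of the models are compared.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def cross_validate_paper_boxes(boxes, min_iou=0.5):
    """Return the first box confirmed by another detector, else None.

    boxes holds one candidate paper box per detector (for example the
    paper area model, the edge detector, and the sketchbook area model).
    """
    for i, box in enumerate(boxes):
        for j, other in enumerate(boxes):
            if i != j and iou(box, other) >= min_iou:
                return box
    return None
```

Returning None when no two detectors agree lets the caller fall back to a single model's result or skip the image.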

The paper area detection model may be a machine learning model, such as YOLO, trained on a plurality of paper images. It is trained on a training dataset labeled with bounding boxes and, when an image is applied, outputs the bounding box coordinates corresponding to the paper area. The edge detector may detect edges using the histogram difference between adjacent pixels. It detects borders using an image processing library routine such as a Canny edge detector: it obtains histogram values for the pixels and defines a location as an edge when its histogram value differs greatly from that of an adjacent pixel. The sketchbook area detection model may be a model trained on training data in which the sketchbook area is labeled pixel by pixel.

Another embodiment, in which the paper area is detected using the edge detector, is as follows. In step 130, the electronic device may use the edge detector to detect the edges of all objects included in the image. The electronic device may then analyze the shapes of the edges and extract all edges (contours) that form a rectangle. Among these, the area corresponding to the largest rectangle may be set as the paper area. The area set as the paper area may be identified as a drawing, and the electronic device may also designate it as the drawing area.
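Selecting the largest rectangular contour can be sketched as follows. The sketch assumes the contours have already been reduced to corner points by an upstream edge detection step, treats any four-cornered contour as a candidate, and measures size by area using the shoelace formula; none of these details is fixed by the description.

```python
def polygon_area(points):
    """Area of a simple polygon from its ordered corner points (shoelace)."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def largest_quadrilateral(contours):
    """Return the four-cornered contour with the largest area, or None."""
    quads = [c for c in contours if len(c) == 4]
    return max(quads, key=polygon_area) if quads else None
```

Contours with any other number of corners are ignored, so a large triangular or circular object would not be mistaken for the paper area.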

Referring again to FIG. 5, in step 200 the electronic device may determine whether the analysis target area includes an object with the texture of an art tool. If so, the electronic device may classify and identify the image as a drawing image (S300). The art tool may be at least one of crayons, artists' paints, colored paper, pencils, ballpoint pens, felt-tip pens, pastels, coating paint, clay, stamps, or markers, and artists' paints include every coloring material regardless of type, such as acrylic, watercolor, and oil paints. A texture classification model is used in step 200 to determine whether an object with the texture of an art tool (drawing data) is included. The texture classification model is a machine learning model trained on a plurality of drawing image sets containing at least one of a point, line, or surface drawn with the texture of an art tool, and may be a CNN-based multi-class classification model. In addition to the drawing image sets containing objects drawn with the texture of an art tool, the training data may further include training images consisting of a region of an object having at least one texture among plastic, skin, vinyl flooring, wooden flooring, or wood. Such objects frequently appear in drawing images, and including them can increase the accuracy of drawing image classification.

In step 200, the electronic device may use a texture classification model trained on drawing image sets containing only objects with the texture of an art tool together with a texture classification model trained on drawing image sets containing both such objects and background objects (skin, vinyl flooring, wooden flooring, wood, etc.), or it may use either model on its own.

In step 200, when the classification result of the texture classification model indicates that at least one art tool (crayon, paint, colored paper, pencil, ballpoint pen, felt-tip pen, etc.) is present with high probability, the electronic device may determine that the analysis target area contains drawing data, and may determine that the image containing the analysis target area is a drawing image, that is, an image containing drawing data.

In step 300, the electronic device may identify or classify the image as a drawing image based on the result of step 200 that the analysis target area includes an object with the texture of an art tool.

Hereinafter, a drawing image identification method using pixel characteristic information is described with reference to FIG. 8. FIG. 8 is a flowchart illustrating a drawing image identification method according to an embodiment of the present invention. Referring to FIG. 8, the method may include an image acquisition step (S5), a pixel characteristic information detection step (S10), and a drawing image identification step (S30).

The image acquisition step (S5) has the same features as the image acquisition step (S50) described with reference to FIG. 5; see the description of step 50.

In the step of detecting pixel characteristic information (S10), the electronic device may extract, from the image, characteristic information of the pixels constituting the image. The characteristic information may be either the RGB value of a pixel or the RGB depth value (RGB-DEPTH) of a pixel. In step 10, the electronic device may apply the image to a monocular depth estimation model trained on a dataset in which depth is expressed in RGB, such as the RGB-d Object Dataset, so that the depth of the image is expressed as RGB values. A pix2pix-based or U-Net-based model may be used as the monocular depth estimation model. The electronic device then obtains the RGB depth value of each pixel by extracting the depth value from the RGB value of each pixel in the RGB-encoded image.

In the step of identifying a drawing image using the pixel characteristic information (S30), the electronic device may perform different operations depending on the type of characteristic information. As one example, when the extracted characteristic information is RGB values, in step 30 the electronic device detects pixel groups in which the difference between the RGB values of adjacent pixels is greater than or equal to a preset threshold, and identifies the image as a drawing image when the number of such pixel groups is at least a preset count. The difference between RGB values may be computed as a Euclidean distance.

As another embodiment of step 30, when the extracted characteristic information is RGB values, the electronic device may cluster the extracted RGB values and identify the image as a drawing image when the number of resulting clusters is less than a preset number (the reference number). Algorithms such as k-means or hierarchical clustering may be used as the clustering algorithm, but the clustering algorithm is not limited to these. In setting the reference number that serves as the criterion for identifying a drawing image, the electronic device may use the characteristics of the drawing images already stored for each user. That is, color cluster statistics of the drawing images may be computed for each user and the result used to set the reference number.

FIG. 9 is a flowchart illustrating a drawing image identification method using RGB depth values according to an embodiment of the present invention.

According to an embodiment in which RGB depth values are used as the pixel characteristic information, the electronic device may generate a cropped image by cropping an area whose RGB depth values fall within a preset range (S31) and identify a drawing image by applying the cropped image to the texture classification model (S33). In step 31, the electronic device may set the area so that pixels with depth values in a similar range are included in a single area. Because pixels with depth values in the same range represent a flat object, the electronic device may recognize the cropped image as a paper area, resize it, and apply it to the texture classification model. In an embodiment, the cropped image may be resized so that its width and height are equal. In step 33, the electronic device obtains texture information by applying the resized cropped image to the texture classification model and identifies the drawing image based on that texture information. The characteristics of the texture classification model are the same as described with reference to FIGS. 1 and 4.

The preprocessing units 110 and 210, the area extraction unit 130, the pixel analysis unit 230, and the determination units 150 and 250 that make up the drawing image identification devices 100 and 200 according to an embodiment of the present invention may be selected and combined as needed. For example, after the pixel analysis unit 230 extracts pixel characteristic information, the area extraction unit 130 may use it to extract a paper area, and the determination unit 250 may set the paper area as the region of interest and apply it to the texture classification model to obtain the texture information contained in the extracted paper area. The present invention is therefore not limited to the embodiments described in this specification, and the goal of identifying drawing images can be achieved through selective combinations of the modules.

The drawing image identification methods described above with reference to FIGS. 4 to 9 may be implemented as a computer program stored on a medium so that a computer executes any one of the methods.

According to the present invention, the entire process of utilizing drawing data can be automated. Specifically, images whose drawing data needs to be utilized and analyzed can be selected automatically from among the images already stored in a database, so the step in which the user selects and inputs the images to be analyzed within an application can be omitted, which in turn provides the user with a better user experience.

The features, structures, effects, and the like described in the embodiments above are included in at least one embodiment of the present invention and are not necessarily limited to a single embodiment. Furthermore, the features, structures, effects, and the like illustrated in each embodiment may be combined or modified for other embodiments by a person of ordinary skill in the field to which the embodiments belong. Content relating to such combinations and modifications should therefore be construed as falling within the scope of the present invention.

In addition, although the description above focuses on the embodiments, these are merely examples and do not limit the present invention. A person of ordinary skill in the field to which the present invention belongs will recognize that various modifications and applications not illustrated above are possible without departing from the essential characteristics of the embodiments. That is, each component specifically shown in the embodiments may be modified and implemented. Differences relating to such modifications and applications should be construed as falling within the scope of the present invention as defined in the appended claims.

100, 200: Drawing image identification device
110, 210: Preprocessing unit
130: Area extraction unit
230: Pixel analysis unit
150, 250: Determination unit

Claims

A method by which an electronic device identifies a drawing image containing drawing data among one or more images, the method comprising:
extracting an analysis target area from an image; and
identifying the image as a drawing image when it is determined, using a texture classification model, that the analysis target area includes an object having the texture of an art tool and a background object,
wherein the texture classification model is a machine learning model trained with training images consisting of a set of a plurality of drawing images, which include at least one of a point, a line, or a surface drawn with the texture of an art tool, and a region of a background object.
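
As an illustration only, the decision rule recited above could be sketched as follows. The label strings, the patch representation, and the `classify_patch` callable (a stand-in for the texture classification model) are assumptions introduced for this sketch; the specification does not define them.

```python
from typing import Callable, Iterable, Set

# Hypothetical label sets; the label strings are assumptions for illustration.
ART_TOOL_LABELS: Set[str] = {"crayon", "paint", "pencil", "marker"}
BACKGROUND_LABELS: Set[str] = {"plastic", "skin", "flooring", "wood"}


def is_drawing_image(patches: Iterable[object],
                     classify_patch: Callable[[object], str]) -> bool:
    """Return True when the analysis target area contains both an
    art-tool texture and a background-object texture."""
    found_art_tool = False
    found_background = False
    for patch in patches:
        # classify_patch stands in for the texture classification model.
        label = classify_patch(patch)
        if label in ART_TOOL_LABELS:
            found_art_tool = True
        elif label in BACKGROUND_LABELS:
            found_background = True
        if found_art_tool and found_background:
            return True
    return False
```

Requiring both kinds of texture, rather than an art-tool texture alone, reflects the wording of the claim; how patches are sampled from the analysis target area is left open here.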

The method of claim 1,
wherein the art tool includes at least one of crayon, paint, colored paper, pencil, ballpoint pen, marker pen, pastel, clay, stamp, or marker.

The method of claim 1,
wherein the background object is an object having at least one texture selected from plastic, skin, flooring, or wood.

The method of claim 1,
wherein extracting the analysis target area comprises:
removing the background from the image and detecting a subject area; and
resizing the subject area and setting it as the analysis target area.
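
A minimal sketch of the crop-and-resize step above, assuming the subject/background separation has already produced a binary mask (nonzero = subject). The function names, the nested-list image representation, and the nearest-neighbor resize are choices made for this sketch, not details taken from the specification.

```python
from typing import List, Optional, Tuple

Grid = List[List[int]]  # image or mask as rows of integer pixel values


def subject_bbox(mask: Grid) -> Optional[Tuple[int, int, int, int]]:
    """Bounding box (top, left, bottom, right), end-exclusive, of nonzero pixels."""
    if not mask:
        return None
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    if not rows or not cols:
        return None  # no subject pixels were detected
    return rows[0], cols[0], rows[-1] + 1, cols[-1] + 1


def crop(image: Grid, box: Tuple[int, int, int, int]) -> Grid:
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


def resize_nearest(image: Grid, out_h: int, out_w: int) -> Grid:
    """Nearest-neighbor resize to out_h x out_w."""
    in_h, in_w = len(image), len(image[0])
    return [[image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
            for r in range(out_h)]


def extract_analysis_area(image: Grid, mask: Grid,
                          size: Tuple[int, int] = (4, 4)) -> Optional[Grid]:
    """Crop the subject area indicated by the mask and resize it to `size`."""
    box = subject_bbox(mask)
    if box is None:
        return None
    return resize_nearest(crop(image, box), size[0], size[1])
```

Resizing to a fixed size gives the downstream classifier a constant input shape regardless of how large the subject appears in the photograph.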

The method of claim 4,
wherein detecting the subject area comprises
cropping the subject area by applying the image to a subject/background area classification model, and
wherein the subject/background area classification model is a machine learning model trained with a large-scale dataset in which area characteristics are labeled on a per-pixel basis.

The method of claim 5, wherein the machine learning model is a U-Net family model.

The method of claim 1,
wherein extracting the analysis target area comprises:
detecting a paper area in the image; and
cropping the detected paper area and setting it as the analysis target area.

The method of claim 7,
wherein detecting the paper area comprises
cross-validating the paper area detection result by applying the image to one or more of: a paper area detection model trained with a set of a plurality of paper images; an edge detector that detects edges using histogram differences between adjacent pixels; or a sketchbook area detection model trained with training data in which sketchbook areas are labeled on a per-pixel basis.
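
The claim does not say how the detection results are cross-validated. One possible reading, sketched below under that assumption, is to accept a paper area only when the bounding boxes from at least two detectors overlap sufficiently, measured by intersection over union (IoU). The box format, the `min_iou` threshold, and the function names are hypothetical.

```python
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom), end-exclusive


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = min(a[2], b[2]) - max(a[0], b[0])
    inter_h = min(a[3], b[3]) - max(a[1], b[1])
    if inter_w <= 0 or inter_h <= 0:
        return 0.0
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def cross_validate(boxes: List[Optional[Box]],
                   min_iou: float = 0.5) -> Optional[Box]:
    """Return the first box that agrees (IoU >= min_iou) with a box from
    another detector; return None when no two detectors agree.
    A None entry means that detector found no paper area."""
    found = [b for b in boxes if b is not None]
    for i, a in enumerate(found):
        for b in found[i + 1:]:
            if iou(a, b) >= min_iou:
                return a
    return None
```

Here each entry of `boxes` would come from one of the detectors listed in the claim, for example the paper area detection model, the edge detector, or the sketchbook area detection model.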

The method of claim 1,
wherein the electronic device is a server, and
the one or more images are received from a terminal.

The method of claim 1, further comprising:
extracting one or more text areas containing text by applying the image to a text extraction model; and
generating a preprocessed image by removing the text areas from the image,
wherein extracting the analysis target area comprises
extracting the analysis target area from the preprocessed image, and
wherein the text extraction model is a model trained with a plurality of text data.
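
As a rough sketch of the preprocessing step above, text areas returned by a text extraction model could be blanked out before the analysis target area is extracted. The box format, the fill value, and the function name are assumptions; the specification does not state how the text areas are removed.

```python
from typing import List, Sequence, Tuple

Grid = List[List[int]]  # grayscale image as rows of pixel values
Box = Tuple[int, int, int, int]  # (left, top, right, bottom), end-exclusive


def remove_text_regions(image: Grid, text_boxes: Sequence[Box],
                        fill: int = 255) -> Grid:
    """Return a copy of the image with every text box overwritten by `fill`."""
    out = [row[:] for row in image]  # copy so the input image is not modified
    height = len(out)
    width = len(out[0]) if out else 0
    for left, top, right, bottom in text_boxes:
        # Clamp each box to the image bounds before filling.
        for r in range(max(top, 0), min(bottom, height)):
            for c in range(max(left, 0), min(right, width)):
                out[r][c] = fill
    return out
```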

A method by which an electronic device identifies a drawing image containing drawing data among one or more images, the method comprising:
extracting RGB depth values (RGB-DEPTH) of pixels constituting an image; and
identifying the image as a drawing image when it is determined, by applying an area in which the RGB depth values (RGB-DEPTH) fall within a similar range to a texture classification model, that the area contains the texture of an art tool,
wherein the texture classification model is a machine learning model trained with a set of a plurality of drawing images including at least one of a point, a line, or a surface drawn with the texture of an art tool.

The method of claim 11,
wherein identifying the drawing image comprises:
generating a cropped image by cropping an area of the image in which the RGB depth values fall within a similar range;
resizing the cropped image; and
applying the resized cropped image to the texture classification model and, when it is determined to contain the texture of an art tool, identifying the image as a drawing image.
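
The claim leaves "a similar range" undefined. The sketch below assumes one simple interpretation: take the lower median depth as the reference, treat pixels within a tolerance `tol` of it as belonging to the same surface, and crop their bounding box. The per-pixel integer depth map, the tolerance, and the function names are assumptions made for illustration; resizing and classification would follow the crop.

```python
from statistics import median_low
from typing import List, Tuple

Grid = List[List[int]]  # image or depth map as rows of integer values


def similar_depth_box(depth: Grid, tol: int) -> Tuple[int, int, int, int]:
    """Bounding box (top, left, bottom, right), end-exclusive, of pixels whose
    depth lies within `tol` of the lower median depth."""
    # median_low always returns a value present in the data, so at least
    # one pixel is guaranteed to match the reference depth.
    ref = median_low([d for row in depth for d in row])
    hits = [(r, c) for r, row in enumerate(depth)
            for c, d in enumerate(row) if abs(d - ref) <= tol]
    rows = [r for r, _ in hits]
    cols = [c for _, c in hits]
    return min(rows), min(cols), max(rows) + 1, max(cols) + 1


def crop_similar_depth(image: Grid, depth: Grid, tol: int = 1) -> Grid:
    """Crop the image to the area whose depth values fall within a similar range."""
    top, left, bottom, right = similar_depth_box(depth, tol)
    return [row[left:right] for row in image[top:bottom]]
```

The intuition is that a sheet of paper is roughly flat, so its pixels share nearly the same estimated depth, while the surroundings differ.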

The method of claim 11,
wherein extracting the RGB depth values of the pixels comprises:
representing the depth of the image as RGB values using a monocular depth estimation model trained with an RGB-D object dataset; and
extracting, for each pixel, the depth value from the image represented with the RGB values.

The method of claim 13,
wherein the monocular depth estimation model is a pix2pix-based or U-Net-based model.

A device for identifying a drawing image containing drawing data among one or more images, the device comprising:
an area extraction unit that extracts an analysis target area from an image; and
a determination unit that identifies the image as a drawing image when it is determined, using a texture classification model, that the analysis target area includes an object having the texture of an art tool and a background object,
wherein the texture classification model is a machine learning model trained with training images consisting of a set of a plurality of drawing images, which include at least one of a point, a line, or a surface drawn with the texture of an art tool, and a region of a background object.

A device for identifying a drawing image containing drawing data among one or more images, the device comprising:
a pixel analysis unit that extracts RGB depth values (RGB-DEPTH) of pixels constituting an image; and
a determination unit that identifies the image as a drawing image when it is determined, by applying an area in which the RGB depth values (RGB-DEPTH) fall within a similar range to a texture classification model, that the area contains the texture of an art tool,
wherein the texture classification model is a machine learning model trained with a set of a plurality of drawing images including at least one of a point, a line, or a surface drawn with the texture of an art tool.

A computer program stored in a medium to execute, using a computer, the method of any one of claims 1 to 14.

(Deleted)