KR20220124594A

KR20220124594A - Virtual fitting method supporting high-resolution image

Info

Publication number: KR20220124594A
Application number: KR1020210049800A
Authority: KR
Inventors: 구교정; 남건민; 김우석; 최승환; 박성현; 이용균
Original assignee: (주)내스타일
Priority date: 2021-03-03
Filing date: 2021-04-16
Publication date: 2022-09-14
Also published as: KR102374141B1; KR20220124593A; KR20220124595A

Abstract

The present invention relates to a virtual fitting method for compositing a fitting garment onto an original image using a computer device. In one embodiment, the disclosed virtual fitting method may include the steps of: generating a segmentation from the original image; generating a deformed fitting garment by deforming the fitting garment (c) based on the original image; generating a misaligned region, which is the part of the existing-garment region of the segmentation that the deformed fitting garment does not overlap; and generating a composite image by inputting input data including the original image and the fitting garment into an image synthesis algorithm.

Description

Virtual fitting method supporting high-resolution image

The present invention relates to a computer-implemented virtual fitting method, and more particularly to a virtual fitting method and apparatus for generating a composite image in which fitting clothes are composited onto an original image.

Virtual fitting technology, which generates an image of a person virtually wearing clothes before the clothes are actually worn and shows it to the user, is now widely used. However, existing virtual fitting techniques mainly target low-resolution datasets, so at resolutions of 1024x768 or higher, small errors become conspicuous and lead to visibly incorrect synthesis.

In addition, because virtual fitting technology composites clothes onto a model in an image, the clothes often fail to blend naturally with the background and look awkward compared with how they would look when actually worn. As a result, it is often difficult for a consumer to predict accurately how the clothes will look when worn.

Patent Document 1: Korean Patent Publication No. 2020-0038777 (published April 14, 2020)
Patent Document 2: Korean Patent Publication No. 2020-0139766 (published December 14, 2020)

The present invention is intended to solve these problems. Its object is to provide a virtual fitting method and apparatus that generate images that remain robust even at high resolution, so that the clothes appear naturally composited and a consumer can predict relatively accurately how the clothes would look when actually worn.

According to an embodiment of the present invention, there is disclosed a virtual fitting method for compositing fitting clothes onto an original image using a computer device, the method comprising: generating a segmentation from the original image; generating deformed fitting clothes by deforming the fitting clothes (c) based on the original image; generating a misaligned region (Mmisalign), which is the part of the existing-clothes region of the segmentation that the region of the deformed fitting clothes does not overlap; and generating a composite image by inputting input data including the original image and the fitting clothes into an image synthesis algorithm.
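
The misaligned region recited above amounts to a set difference between two regions. A minimal sketch follows, assuming both regions are available as binary masks of equal size. The function name `misaligned_region` and the list-of-lists mask representation are illustrative choices and do not appear in the disclosure.

```python
from typing import List

Mask = List[List[int]]  # binary mask: 1 = pixel belongs to the region, 0 = it does not


def misaligned_region(existing_clothes: Mask, deformed_clothes: Mask) -> Mask:
    """Keep pixels inside the existing-clothes region that the deformed clothes do not cover."""
    if len(existing_clothes) != len(deformed_clothes):
        raise ValueError("masks must have the same height")
    result = []
    for exist_row, deform_row in zip(existing_clothes, deformed_clothes):
        if len(exist_row) != len(deform_row):
            raise ValueError("masks must have the same width")
        result.append([1 if e and not d else 0 for e, d in zip(exist_row, deform_row)])
    return result


# Hypothetical 3x4 example: the deformed clothes are narrower than the existing clothes.
existing = [
    [0, 1, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 0],
]
deformed = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
]
m_misalign = misaligned_region(existing, deformed)
```

In this example only the rightmost column of the first two rows remains set, because those pixels belong to the existing clothes but are not covered by the deformed fitting clothes.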

According to an embodiment of the present invention, there is disclosed a computer-readable recording medium on which a computer program for executing the virtual fitting method is recorded.

According to an embodiment of the present invention, an image that remains robust even at high resolution can be generated, and a composite image in which the clothes are naturally composited can be produced. This has the technical effect of allowing a consumer to predict relatively accurately how the clothes would look when actually worn.

FIG. 1 is a block diagram illustrating a system that executes a virtual fitting method according to an embodiment;
FIG. 2 is a block diagram of a virtual fitting system according to an embodiment;
FIG. 3 is a diagram illustrating the operation of a preprocessor according to an embodiment;
FIG. 4 is a flowchart illustrating the operation of the preprocessor according to an embodiment;
FIG. 5 is a diagram illustrating a method of generating a clothing-removed image (Ia) according to an embodiment;
FIG. 6 is a diagram illustrating the operation of a segmentation unit according to an embodiment;
FIG. 7 is a flowchart illustrating the operation of the segmentation unit according to an embodiment;
FIG. 8 is a diagram illustrating the operation of a clothes transforming unit according to an embodiment;
FIG. 9 is a flowchart illustrating the operation of the clothes transforming unit according to an embodiment;
FIG. 10 is a diagram illustrating a clothes transforming unit according to an alternative embodiment;
FIG. 11 is a diagram illustrating the operation of an image synthesizing unit according to an embodiment;
FIG. 12 is a flowchart illustrating the operation of the image synthesizing unit according to an embodiment;
FIG. 13 is a diagram illustrating input data of an image synthesis algorithm according to an embodiment;
FIGS. 14 to 16 are diagrams illustrating an image synthesis algorithm according to an embodiment;
FIG. 17 is a diagram illustrating an image synthesizing unit according to an alternative embodiment; and
FIG. 18 is a diagram illustrating the effects of the present invention according to an embodiment.

The above objects, other objects, features, and advantages of the present invention will be readily understood from the following preferred embodiments taken in conjunction with the accompanying drawings. However, the present invention is not limited to the embodiments described herein and may be embodied in other forms. Rather, the embodiments introduced here are provided so that the disclosure will be thorough and complete and will fully convey the spirit of the invention to those skilled in the art.

In this specification, when terms such as first and second are used to describe components, the components should not be limited by those terms. The terms are used only to distinguish one component from another. The embodiments described and illustrated herein also include their complementary embodiments.

In this specification, the singular form also includes the plural unless the context specifically states otherwise. The expressions 'comprises', 'is composed of', and 'consists of' as used in the specification do not exclude the presence or addition of one or more elements other than those stated.

In this specification, the term 'software' means the technology that operates the hardware of a computer; the term 'hardware' means the tangible devices or equipment that make up a computer (CPU, memory, input devices, output devices, peripheral devices, and so on); the term 'step' means a series of processes or operations connected in time sequence to achieve a given purpose; the terms 'computer program', 'program', and 'algorithm' mean a set of instructions suitable for processing by a computer; and the term 'program recording medium' means a computer-readable recording medium on which a program is recorded for the purpose of installing, executing, or distributing the program.

Terms such as '~ part', '~ module', '~ unit', '~ block', and '~ board' used in this specification to refer to components of the invention may mean a physical, functional, or logical unit that handles at least one function or operation. Such a unit may be implemented as one or more pieces of hardware, software, or firmware, or as a combination of one or more pieces of hardware, software, and/or firmware.

In this specification, a 'processing device', 'computer', 'computing device', 'server device', or 'server' may be implemented as a device having an operating system such as Windows, Mac, or Linux, a computer processor, memory, application programs, a storage device (for example, an HDD or SSD), and a monitor. The computer may be, for example, a desktop computer, a notebook computer, or a mobile terminal, but these are examples and the computer is not limited to them. The mobile terminal may be a mobile wireless communication device such as a smartphone, a tablet PC, or a PDA.

A computer device according to an embodiment of the present invention receives an "original image" and a "fitting clothes image" and composites them to generate a "composite image" or "result image". In this specification, the "original image" means an image showing all or part of the body of the person (model) who is to be virtually dressed. The "fitting clothes image" means an image containing the clothes (fitting clothes) to be virtually put on the model in the original image. The fitting clothes may be a top and/or a bottom.

In this specification, "existing clothes" means the clothes that are replaced by the fitting clothes, that is, those clothes worn by the model in the original image that are to be replaced by the fitting clothes. For example, when a fitting top is to be virtually put on a model who wears a top and a bottom in the original image, the 'existing clothes' are the top in the original image.

In this specification, a "composite image" or "result image" is an image produced by receiving the original image and the fitting clothes image and compositing them. It shows the person in the original image virtually wearing the fitting clothes.

Meanwhile, when this specification refers to the body or clothes of a person (model), the word 'image' may be omitted for convenience. For example, terms such as 'person', 'top', 'bottom', 'fitting clothes', and 'existing clothes' may mean an actual person, body part, or garment, but those skilled in the art will understand that they may also mean an image (image file) in which that object is depicted, or a region (pixels) of such an image.

The present invention is described in detail below with reference to the drawings. In the description of specific embodiments below, various specific details are provided to explain the invention more concretely and to aid understanding. However, a reader with sufficient knowledge of this field to understand the invention will recognize that the invention may be practiced without these specific details. It is also noted in advance that parts that are publicly known or commonly used and not closely related to the invention are omitted to avoid confusion in describing the invention.

FIG. 1 is a block diagram illustrating a system that executes a virtual fitting method according to an embodiment.

Referring to FIG. 1, a virtual fitting method according to an embodiment may be executed on a computer device 10. The computer device 10 may be implemented as an ordinary general-purpose computer such as a desktop or notebook computer, or as a mobile terminal such as a smartphone. The computer device 10 may run one or more application programs for executing the virtual fitting method described below with reference to FIGS. 2 to 17.

In an embodiment, the computer device 10 may receive an image of all or part (for example, the upper body) of a person (model) from a camera 20 and/or a user terminal 30. The camera 20 may be a camera device independent of the computer device 10 or a camera connected to the computer device 10 by wire or wirelessly. When the computer device 10 is a smartphone, the camera 20 may be the camera built into that smartphone. The user terminal 30 may be a portable device such as a smartphone; for example, a person may be photographed with the camera of the user terminal 30 and the captured image transmitted to the computer device 10 by wire or wirelessly.

In an embodiment, the computer device 10 receives an image file of a subject (for example, a person) from the camera 20 or the user terminal 30. The image file is in a computer-readable format and may, for example, have an extension such as JPG or TIF. The image of all or part of a person's body received by the computer device 10 is hereinafter referred to as the "original image".

The computer device 10 may also receive an image of the clothes to be virtually put on the person in the original image (hereinafter "fitting clothes"). The fitting clothes image may likewise be received from an external device such as the camera 20 or the user terminal 30.

The computer device 10 may composite the original image and the fitting clothes image and output an image of the person (model) wearing the fitting clothes (hereinafter the "result image" or "composite image").

In an embodiment, the computer device 10 may output the result image to the user through a display 40. The display 40 may be a device independent of the computer device 10, or, when the computer device 10 is a smartphone for example, the display 40 may be the display built into that smartphone.

FIG. 2 is a block diagram of a virtual fitting system according to an embodiment.

Referring to FIG. 2, in an embodiment the computer device 10 may include a preprocessor 100, a segmentation unit 200, a clothes transforming unit 300, and an image synthesizing unit 400. Each of these units may be implemented as software, such as one or more programs or deep learning algorithms that process the original image and/or the fitting clothes image, and may be stored in a storage device of the computer device 10 (for example, an SSD or hard disk) and then loaded into memory and executed.

The operation of each of the preprocessor 100, the segmentation unit 200, the clothes transforming unit 300, and the image synthesizing unit 400 is described below with reference to the drawings.

FIG. 3 is a diagram illustrating the operation of the preprocessor 100 according to an embodiment, and FIG. 4 is a flowchart illustrating the operation of the preprocessor 100.

The preprocessor 100 may remove the existing clothes from the original image. New clothes (that is, the fitting clothes) can be fitted freely once the existing clothes have been removed, which is why the preprocessor 100 removes them.

Referring to FIGS. 3 and 4, in step S110 the preprocessor 100 extracts pose information P from the original image I. The pose information P (also simply "pose P") may be extracted using a general pose extraction algorithm that finds the feature points of a person's pose in an image. In FIG. 3, the pose P image shows the feature points found by the pose extraction algorithm and the lines connecting them.

Next, in step S120, a first segmentation S is generated from the original image I. A segmentation S, also called a "segmentation map", "semantic image", or "semantic map", is obtained by classifying every pixel of the image into one of a set of predefined classes and generating a channel image for each class. For example, the face, hair, top, bottom, arms, and background may each be a predefined class, and the first segmentation S is image data consisting of as many channels as there are classes. For convenience, FIG. 3 shows the classes in different colors as a single segmentation S image.
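
The relationship between a single color-coded segmentation image and its per-class channels can be sketched as follows. This is an illustration only: the class list, its ordering, and the function name `to_channels` are assumptions, and plain Python lists are used to keep the example free of dependencies.

```python
from typing import List

# Illustrative class list; the actual classes and their order are not specified in the text.
CLASSES = ["background", "face", "hair", "top", "bottom", "arm"]


def to_channels(label_map: List[List[int]], num_classes: int) -> List[List[List[int]]]:
    """Return one binary channel per class: channels[k][y][x] is 1 only where the pixel has label k."""
    return [
        [[1 if label == k else 0 for label in row] for row in label_map]
        for k in range(num_classes)
    ]


# Hypothetical 4x4 label map (each value is an index into CLASSES).
label_map = [
    [0, 2, 2, 0],
    [0, 1, 1, 0],
    [5, 3, 3, 5],
    [0, 4, 4, 0],
]
channels = to_channels(label_map, len(CLASSES))
```

Every pixel is set in exactly one channel, so the color-coded image in the drawings and the multi-channel data carry the same information.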

The first segmentation S may be generated using a known segmentation generation algorithm 110, such as a deep learning algorithm, an artificial neural network algorithm, or an image generation algorithm in which an encoder and a decoder are connected.

Next, in step S130, the existing clothes (in this embodiment, the top) are removed from the original image I based on the pose P and the first segmentation S. For example, a specific color (gray in the drawings) is first painted with a brush along the shoulder-elbow-wrist path on each side, using the feature points of the pose P. That is, the existing clothes can be removed by painting gray over the original image with a circular brush. Then, as shown in FIG. 5(a), a rectangular region is defined from the two shoulders and the pelvis based on the feature point information of the pose P, and the corresponding region of the original image is painted gray. As shown in FIG. 5(b), a rectangular region centered on the neck point is likewise defined and the corresponding region of the original image is painted gray. In this way the existing clothes and their surrounding region are removed from the original image.
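
The brush and rectangle painting described above might be sketched as follows. The keypoint names, coordinates, brush radius, and gray value are all hypothetical, and a single-channel image stored as nested lists stands in for the original image. The neck rectangle of FIG. 5(b) would be one more `paint_rect` call and is omitted for brevity.

```python
from typing import List, Tuple

Image = List[List[int]]   # grayscale image as rows of pixel values
Point = Tuple[int, int]   # (x, y) pixel coordinates
GRAY = 128                # fill value standing in for the gray used in the drawings


def paint_disk(img: Image, center: Point, radius: int, value: int) -> None:
    """Stamp one circular brush footprint, clipped to the image bounds."""
    cx, cy = center
    for y in range(max(0, cy - radius), min(len(img), cy + radius + 1)):
        for x in range(max(0, cx - radius), min(len(img[0]), cx + radius + 1)):
            if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2:
                img[y][x] = value


def paint_path(img: Image, points: List[Point], radius: int, value: int) -> None:
    """Drag the brush along consecutive keypoints, e.g. shoulder -> elbow -> wrist."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        steps = max(abs(x1 - x0), abs(y1 - y0), 1)
        for i in range(steps + 1):
            t = i / steps
            paint_disk(img, (round(x0 + t * (x1 - x0)), round(y0 + t * (y1 - y0))), radius, value)


def paint_rect(img: Image, top_left: Point, bottom_right: Point, value: int) -> None:
    """Fill an axis-aligned rectangle (inclusive corners), clipped to the image bounds."""
    (x0, y0), (x1, y1) = top_left, bottom_right
    for y in range(max(0, y0), min(len(img), y1 + 1)):
        for x in range(max(0, x0), min(len(img[0]), x1 + 1)):
            img[y][x] = value


# Hypothetical keypoints on a 12x12 image.
img = [[0] * 12 for _ in range(12)]
pose = {
    "r_shoulder": (3, 2), "r_elbow": (2, 5), "r_wrist": (1, 8),
    "l_shoulder": (8, 2), "l_elbow": (9, 5), "l_wrist": (10, 8),
    "r_hip": (4, 9), "l_hip": (7, 9),
}
for side in ("r", "l"):
    arm = [pose[f"{side}_shoulder"], pose[f"{side}_elbow"], pose[f"{side}_wrist"]]
    paint_path(img, arm, radius=1, value=GRAY)

# Torso rectangle spanned by the shoulders and hips (the FIG. 5(a) step).
torso_keys = ("r_shoulder", "l_shoulder", "r_hip", "l_hip")
xs = [pose[k][0] for k in torso_keys]
ys = [pose[k][1] for k in torso_keys]
paint_rect(img, (min(xs), min(ys)), (max(xs), max(ys)), GRAY)
```

A production implementation would more likely use an image library's drawing primitives; the explicit loops here only make the geometry of the brush strokes visible.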

Next, if there are class regions that must be preserved in the final composite image (result image), such as the face and the bottom, those regions are restored to match the original image based on the first segmentation S. This yields an image from which the existing clothes have been removed, as shown in FIG. 5(c) (hereinafter the "clothing-removed image Ia"). The clothing-removed image Ia could remove only the existing clothes that are to be replaced by the fitting clothes, but as shown in FIG. 5(c), the regions around the arms and shoulders may also be removed so that the region available for the fitting clothes is as large as possible. The clothing-removed image Ia may be used in the later image synthesis step to tell the deep learning algorithm which parts must be preserved.
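
Restoring the preserved classes is a per-pixel selection between the original image and the painted image, driven by the first segmentation. The sketch below assumes the segmentation is available as a map of integer class labels; the label values and the function name `restore_preserved` are illustrative.

```python
from typing import List, Set

Image = List[List[int]]


def restore_preserved(original: Image, painted: Image, labels: Image, keep: Set[int]) -> Image:
    """Take the original pixel wherever its class must be preserved, else the painted pixel."""
    return [
        [orig_px if label in keep else paint_px
         for orig_px, paint_px, label in zip(orig_row, paint_row, label_row)]
        for orig_row, paint_row, label_row in zip(original, painted, labels)
    ]


# Illustrative class labels; the real label values are not given in the text.
BACKGROUND, FACE, TOP, BOTTOM = 0, 1, 3, 4

original = [[10, 11, 12], [20, 21, 22]]
painted = [[128, 128, 128], [128, 128, 128]]   # everything painted gray in this toy case
labels = [[FACE, TOP, TOP], [BOTTOM, TOP, BACKGROUND]]

ia = restore_preserved(original, painted, labels, keep={FACE, BOTTOM})
```

Only the face and bottom pixels return to their original values, while the top and its surroundings stay gray.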

FIG. 6 is a diagram illustrating the operation of the segmentation unit 200 according to an embodiment, and FIG. 7 is a flowchart illustrating the operation of the segmentation unit 200. The segmentation unit 200 is a functional block that generates a new segmentation matched to the clothes to be put on, that is, the fitting clothes (hereinafter the "second segmentation"), and the subsequent synthesis may proceed on the basis of this second segmentation. In other words, by discarding the information about the existing clothes entirely, generating a segmentation that reflects the shape of the fitting clothes, and using it in the subsequent synthesis, the fitting clothes can be fitted freely to the body. Pose information is also supplied so that the posture can be estimated.

In step S210 of FIG. 7, the pose P is extracted from the original image I, and in step S220 the first segmentation S is generated. Steps S210 and S220 are the same as steps S110 and S120 of FIG. 4, so the pose P and the first segmentation S generated in steps S110 and S120 may be used.

Next, in step S230, the clothes are removed from the first segmentation S based on the pose P to generate a clothing-removed segmentation Sa. For example, the region of the first segmentation S corresponding to the existing clothes is painted black (which denotes the background) with a circular brush to remove the existing clothes. For instance, the top region of the first segmentation S (shown in orange in FIG. 6) is painted black. The arm regions are also removed at this time by painting black with the brush along the shoulder-elbow-wrist path on each side, using the feature points of the pose P. The class regions that must be preserved in the composite image (result image), such as the face and the bottom, are then restored based on the first segmentation S. This yields the clothing-removed segmentation Sa shown in FIG. 6, from which the clothing and arm regions have been removed.
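
The same idea applied to the segmentation itself can be sketched on a label map. Here the arm brush strokes are assumed to have been rasterized already into a binary mask; the label values and the name `remove_clothes` are hypothetical.

```python
from typing import List, Set

LabelMap = List[List[int]]
BACKGROUND, FACE, TOP, BOTTOM, ARM = 0, 1, 2, 3, 4  # illustrative class labels


def remove_clothes(labels: LabelMap, brush_mask: LabelMap,
                   clothes_label: int, keep: Set[int]) -> LabelMap:
    """Relabel clothing pixels and brushed pixels as background, except preserved classes."""
    result = []
    for label_row, brush_row in zip(labels, brush_mask):
        row = []
        for label, brushed in zip(label_row, brush_row):
            erase = (label == clothes_label or brushed) and label not in keep
            row.append(BACKGROUND if erase else label)
        result.append(row)
    return result


# Hypothetical first segmentation S and arm brush mask.
s = [
    [0, 1, 1, 0],
    [4, 2, 2, 4],
    [4, 2, 2, 4],
    [0, 3, 3, 0],
]
arm_brush = [
    [0, 1, 0, 0],   # the brush grazes a face pixel, which must survive
    [1, 0, 0, 1],
    [1, 0, 0, 1],
    [0, 0, 0, 0],
]
sa = remove_clothes(s, arm_brush, clothes_label=TOP, keep={FACE, BOTTOM})
```

The top and arm pixels become background while the face and bottom labels are untouched, which matches the description of Sa.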

Next, in step S240, the second segmentation is generated based on the clothing-removed segmentation Sa, the pose P, and the fitting clothes c. The second segmentation may be generated using a known segmentation generation algorithm 220, such as a deep learning algorithm or an artificial neural network algorithm.

The second segmentation is a segmentation that reflects the fitting clothes c. In an embodiment, therefore, the clothing-removed segmentation Sa, in which the existing clothes have been removed, together with the pose P and the fitting clothes c, are used as input data for generating it. The fitting clothes c are input to supply the clothing information that the second segmentation must reflect, the clothing-removed segmentation Sa serves as the base image for generating the second segmentation, and the pose P may be input to supply pose information such as which part of the fitting clothes c is rotated and by what angle.
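
One common way to feed several image-like inputs to a generator network is to concatenate them along the channel axis. The text does not state how Sa, P, and c are combined, so the following is only a sketch under that assumption; the channel counts, the per-keypoint pose channels, and the name `stack_inputs` are all hypothetical.

```python
from typing import List

Channel = List[List[float]]  # one H x W plane


def stack_inputs(*groups: List[Channel]) -> List[Channel]:
    """Concatenate channel groups along the channel axis after checking they share one H x W."""
    stacked = [channel for group in groups for channel in group]
    if not stacked:
        raise ValueError("no channels given")
    height, width = len(stacked[0]), len(stacked[0][0])
    for channel in stacked:
        if len(channel) != height or any(len(row) != width for row in channel):
            raise ValueError("all channels must have the same height and width")
    return stacked


def blank(height: int, width: int) -> Channel:
    """Helper that builds an all-zero plane for the example."""
    return [[0.0] * width for _ in range(height)]


H, W = 4, 3
sa_channels = [blank(H, W) for _ in range(6)]       # assumed: one channel per class of Sa
pose_channels = [blank(H, W) for _ in range(2)]     # assumed: one channel per keypoint of P
clothes_channels = [blank(H, W) for _ in range(3)]  # assumed: RGB planes of c, resized to H x W

generator_input = stack_inputs(sa_channels, pose_channels, clothes_channels)
```

The size check matters in practice because the fitting clothes image generally has to be resized to the resolution of the segmentation before the inputs can be combined.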

The second segmentation is generated, for example, as in the image shown in FIG. 6. As a comparison with the first segmentation S shows, the top (orange) region of the second segmentation has a shape similar to the fitting clothes rather than the existing clothes. In the second segmentation as well, the face, hair, top, bottom, arms, background, and so on are each a predefined class, and the second segmentation is image data consisting of as many channels as there are classes. For convenience, FIG. 6 shows the classes in different colors as a single image.

The clothing-removed image Ia, the clothing-removed segmentation Sa, and the second segmentation differ as follows. For the clothing-removed image Ia, a relatively large region is deleted from the original image so that the region able to receive the fitting clothes is as large as possible, even when the fitting clothes c are large. The clothing-removed segmentation Sa, by contrast, serves as input data for the segmentation unit 200 to generate the second segmentation; only the existing clothes need to be removed from it, so a large region need not be deleted. The second segmentation is the segmentation that reflects the fitting clothes c, and may be used by the clothes transforming unit 300 and the image synthesizing unit 400.

FIG. 8 is a diagram illustrating the operation of the clothes transforming unit 300 according to an embodiment, and FIG. 9 is a flowchart illustrating the operation of the clothes transforming unit 300. The clothes transforming unit 300 is a functional unit that deforms the fitting clothes based on the pose of the person in the original image.

Referring to FIGS. 8 and 9, first, in step S310, input data is fed to a deep learning algorithm that performs garment deformation, producing an output variable (θ) that indicates the degree of deformation (warping) of the garment. In one embodiment, the garment-class channel of the second segmentation and the fitting garment (c) may be used as the input data. In this case, the fitting garment (c) is the data to be warped, and the garment-class channel is the data that sets the region the deformed garment is to occupy.

Alternatively, as shown in FIG. 8, the clothing-removed image (Ia), the pose (P), the garment-class channel of the second segmentation, and the fitting garment (c) may be used as input data to the deep learning algorithm. In this case, the pose (P) indicates which part of the garment is twisted, and at what angle, when it is deformed, and the clothing-removed image (Ia) provides information about the parts that must not be covered by the garment. For example, the lower part of the fitting garment (a top) may have to be removed because of a bottom such as a skirt or trousers. The fitting garment must then not only be deformed according to the person's pose; its lower region must also be removed. The clothing-removed image (Ia) is used so that the part to be removed is reflected as well.

As above, the clothing-removed image (Ia), the pose (P), and the garment-class channel of the second segmentation are input to an encoder 310 of the deep learning algorithm, and the fitting garment (c) is input to a separate encoder 310. Each encoder performs multi-stage convolution and downsampling; the two results are then concatenated (Concat) and decoded to obtain the output variable (θ). Here, the output variable (θ) is a value that defines the degree of deformation (warping) of the garment. The output variable (θ) and the fitting garment (c) may be input to, for example, a known warping algorithm 320 to output the warped garment (W).
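How an output variable (θ) can drive a warping step is sketched below. The text only refers to "a known warping algorithm" and does not say how θ is parameterized, so this example swaps in a simple stand-in: θ is assumed to be a 2 x 3 affine matrix, applied by backward mapping with nearest-neighbour sampling. The function name `warp_affine` is hypothetical.

```python
def warp_affine(image, theta, fill=0):
    """Backward-warp `image` (a list of rows) with a 2 x 3 affine matrix `theta`.

    Assumption: theta maps each OUTPUT pixel (x, y) to the SOURCE location
    it is sampled from. Samples that fall outside the image become `fill`.
    """
    height, width = len(image), len(image[0])
    warped = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            src_x = round(theta[0][0] * x + theta[0][1] * y + theta[0][2])
            src_y = round(theta[1][0] * x + theta[1][1] * y + theta[1][2])
            if 0 <= src_x < width and 0 <= src_y < height:
                warped[y][x] = image[src_y][src_x]
    return warped

# Toy fitting garment c (non-zero values stand for garment pixels).
c = [
    [1, 2, 0],
    [3, 4, 0],
    [0, 0, 0],
]
theta = [[1, 0, -1], [0, 1, 0]]   # sample from one pixel to the left: garment moves right
W = warp_affine(c, theta)          # [[0, 1, 2], [0, 3, 4], [0, 0, 0]]
```

In the described embodiment θ would come from the encoder-decoder network rather than being written by hand, and a more flexible warp than an affine map may be used.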

The deep learning algorithm composed of an encoder and a decoder as described above may use a known machine learning algorithm, and may be implemented by training it appropriately with training data suited to the embodiment of the present invention.

FIG. 10 is a schematic block diagram of the garment transforming unit 300 according to alternative embodiments.

The embodiment of FIG. 10(a) uses the clothing-removed image (Ia) and the fitting garment (c) as input data. For example, the clothing-removed image (Ia) and the fitting garment (c) are each input to an encoder 310 of the deep learning algorithm to generate the output variable (θ), and the fitting garment (c) is warped based on the output variable (θ) to output the warped fitting garment (W). In yet another alternative embodiment, the original image (I) may be used instead of the clothing-removed image (Ia).

The embodiment of FIG. 10(b) may use the original image (I), the garment-class channel (Sc) of the first segmentation, and the fitting garment (c) as input data. For example, the original image (I), the garment-class channel (Sc) of the first segmentation, and the fitting garment (c) are each input to an encoder 310 of the deep learning algorithm to generate the output variable (θ), and the fitting garment (c) is warped based on the output variable (θ) to output the warped fitting garment (W).

The embodiment of FIG. 10(c) may use the original image (I), the garment-class channel of the second segmentation, and the fitting garment (c) as input data. For example, the original image (I), the garment-class channel of the second segmentation, and the fitting garment (c) are each input to an encoder 310 of the deep learning algorithm to generate the output variable (θ), and the fitting garment (c) is warped based on the output variable (θ) to output the warped fitting garment (W). In yet another alternative to the embodiments of FIGS. 10(b) and 10(c), the clothing-removed image (Ia) may be used instead of the original image (I).

FIG. 11 illustrates the operation of the image synthesizing unit 400 according to an embodiment, and FIG. 12 is a flowchart of that operation.

In one embodiment, a misaligned region (Mmisalign) is generated, namely the part of the garment region of the first segmentation (S) or the second segmentation that the region of the warped fitting garment (W) does not overlap. The misaligned region (Mmisalign), the original image (I) or an image obtained by processing the original image (for example, Ia), and the warped fitting garment (W) are then input to the image synthesis algorithm 410 as input data to generate a composite image.

To generate the misaligned region (Mmisalign), for example, an aligned region (Malign), in which the garment region of the segmentation (for example, either the first segmentation (S) or the second segmentation) and the region of the warped fitting garment (W) overlap, is generated first. The misaligned region (Mmisalign) is then obtained by excluding the aligned region (Malign) from the garment region of the segmentation.

The segmentation used here may be the first segmentation (S) or the second segmentation, and the second segmentation is preferred. In this case, as in step S410 of FIG. 12, the misaligned region (Mmisalign) is generated as the part of the garment-class channel of the second segmentation that the warped garment (W) fails to cover. Then, in step S420, a first data set (Ia, P, W) and a second data set (the second segmentation and Mmisalign) are input to the image synthesis algorithm as input data to generate a composite image.

The misaligned region (Mmisalign) may be generated as shown in FIG. 13(a). That is, the part where the garment-class channel of the second segmentation and the region of the warped fitting garment (W) overlap is first generated as the aligned region (Malign), and the region of that garment-class channel excluding the aligned region (Malign) is defined as the misaligned region (Mmisalign).
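The two mask operations described here, an intersection followed by a subtraction, can be sketched with binary masks. The masks are toy data, and the function names `aligned_region` and `misaligned_region` are hypothetical; only the order of operations is taken from the description.

```python
def aligned_region(garment_channel, warped_mask):
    """Malign: pixels that lie in both the segmentation's garment region and W."""
    return [
        [s & w for s, w in zip(seg_row, w_row)]
        for seg_row, w_row in zip(garment_channel, warped_mask)
    ]

def misaligned_region(garment_channel, warped_mask):
    """Mmisalign: garment-region pixels that remain after removing Malign."""
    m_align = aligned_region(garment_channel, warped_mask)
    return [
        [s - a for s, a in zip(seg_row, a_row)]
        for seg_row, a_row in zip(garment_channel, m_align)
    ]

# Garment-class channel of the segmentation (1 = garment pixel).
garment_channel = [
    [0, 1, 1, 0],
    [1, 1, 1, 1],
    [0, 1, 1, 0],
]
# Region actually covered by the warped fitting garment W.
warped_mask = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 1],
]
m_align = aligned_region(garment_channel, warped_mask)
m_misalign = misaligned_region(garment_channel, warped_mask)
# m_misalign marks the two shoulder pixels in the middle row that W does not reach.
```

Note that the pixel W covers outside the garment channel (bottom right) appears in neither mask, since both are restricted to the segmentation's garment region.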

Meanwhile, the first data set input to the image synthesis algorithm 410 in step S420 includes the clothing-removed image (Ia), the pose (P), and the warped fitting garment (W), and the second data set may include the second segmentation and the misaligned region (Mmisalign). The first data set provides the images to be composited for output as the virtual fitting result. It may basically include the original image (I) and the warped fitting garment (W); preferably, however, the clothing-removed image (Ia), in which the existing garment is removed from the original image (I), and the warped fitting garment (W) are used as the input data. The clothing-removed image (Ia) can then serve to designate the regions that must be preserved in the composite image. In an embodiment of the present invention, information about the pose (P) may also be included in the first data set. The pose (P) is input so that it can be referred to when the texture of the fitting garment is generated while the garment is composited onto the original image; for example, the image synthesis algorithm may refer to the pose (P) when creating texture in the part it has to newly generate, that is, the misaligned region (Mmisalign).

The second data set input to the image synthesis algorithm 410 in step S420 serves to indicate which regions must keep their texture and which regions must be newly generated. In one embodiment, the second data set includes the second segmentation and the misaligned region (Mmisalign).
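The grouping of inputs into a first and a second data set can be pictured as two small containers. This is only an organizational sketch: the class and field names are hypothetical, and the placeholder strings stand in for image data.

```python
from dataclasses import dataclass
from typing import Any, Optional

@dataclass
class FirstDataSet:
    """What is composited: the person image and the warped garment."""
    person_image: Any            # Ia preferably, or the original image I
    warped_garment: Any          # W
    pose: Optional[Any] = None   # P; described as an optional addition

@dataclass
class SecondDataSet:
    """Where texture is kept and where it has to be generated."""
    segmentation: Any            # the second segmentation in the preferred case
    m_misalign: Any              # Mmisalign

def synthesis_inputs(first, second):
    """List the inputs handed to the synthesis algorithm, skipping absent ones."""
    items = [first.person_image, first.pose, first.warped_garment,
             second.segmentation, second.m_misalign]
    return [item for item in items if item is not None]

first = FirstDataSet(person_image="Ia", warped_garment="W", pose="P")
second = SecondDataSet(segmentation="second segmentation", m_misalign="Mmisalign")
inputs = synthesis_inputs(first, second)   # five inputs; four if the pose is left out
```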

In one embodiment, when the second data set is input to the image synthesis algorithm, the second segmentation data may be input in separated form, as shown in FIG. 13(b). That is, the second segmentation is separated into the fitting garment region (its garment-class channel) and the remaining region excluding that channel, and the separated second segmentation data and the misaligned region (Mmisalign) are input to the image synthesis algorithm 410. The second segmentation is separated in this way because the image synthesis algorithm 410 can recognize the region in which the fitting garment is to be composited more accurately when it receives these inputs individually. In an alternative embodiment, of course, the second segmentation may be input as a single piece of data without being separated.
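The separated and unseparated ways of supplying the second data set can be sketched as below. The representation of the "remaining region" as the list of all non-garment channels is an assumption, as are the function names and the choice of index 1 for the garment class.

```python
def split_segmentation(channels, garment_index):
    """Separate the garment-class channel from the remaining channels."""
    garment = channels[garment_index]
    rest = [ch for k, ch in enumerate(channels) if k != garment_index]
    return garment, rest

def second_data_set(channels, garment_index, m_misalign, split=True):
    """Assemble the second data set, with or without separating the segmentation."""
    if split:
        garment, rest = split_segmentation(channels, garment_index)
        return {"garment": garment, "rest": rest, "m_misalign": m_misalign}
    return {"segmentation": channels, "m_misalign": m_misalign}

# Toy 2 x 2 segmentation with three class channels.
channels = [
    [[1, 0], [0, 0]],   # background
    [[0, 1], [1, 0]],   # garment class (index 1 is an arbitrary choice)
    [[0, 0], [0, 1]],   # arm
]
m_misalign = [[0, 0], [1, 0]]

separated = second_data_set(channels, garment_index=1, m_misalign=m_misalign)
combined = second_data_set(channels, garment_index=1, m_misalign=m_misalign,
                           split=False)
```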

Meanwhile, as shown in FIGS. 14 and 15(a), in one embodiment the image synthesis algorithm may be implemented as a decoder in which a plurality of convolution blocks are connected in multiple stages. In the embodiment of FIG. 14, the image synthesis algorithm is configured as a decoder in which a plurality of residual blocks (ResBlk) 415, each including convolution and upsampling operations, are connected in multiple stages, and the first data set and the second data set are input as input data to each of the residual blocks.

The residual block (ResBlk) is a known technique for lowering the difficulty of optimization when training a deep learning algorithm. Because it is hard to learn the underlying mapping H directly, the difference between the result H(x) and the input x, that is, F(x) = H(x) - x, is treated as the residual and learned instead. In a block diagram, the input x is therefore fed into the convolution stage and, at the same time, branched and connected directly to the output.
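The skip connection amounts to computing H(x) = F(x) + x, so the learned part only has to represent the difference from the input. A minimal sketch on plain lists of numbers, with `residual_fn` as a hypothetical stand-in for the convolution stage F:

```python
def residual_block(x, residual_fn):
    """Return H(x) = F(x) + x, where `residual_fn` plays the role of F."""
    return [f + xi for f, xi in zip(residual_fn(x), x)]

# If the desired mapping is close to the identity, F only has to output
# something close to zero, which is the easier target to learn.
identity_like = residual_block([1.0, 2.0], lambda v: [0.0] * len(v))

# A non-trivial residual: F(x) = 0.5 * x, so H(x) = 1.5 * x.
scaled = residual_block([2.0, 4.0], lambda v: [0.5 * a for a in v])
```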

An embodiment of the present invention adopts this residual block (ResBlk) technique and, as shown in FIG. 15(b), configures each residual block as an alignment-aware residual block (ALIAS ResBlk) 415. Each alignment-aware residual block 415 feeds its input (hi) into one path that passes through the alignment-aware normalization stage 417 twice and another path that passes through it once, and sums the results to produce the output (hi+1). The first data set may be input to the input of each alignment-aware residual block 415, and the second data set may be input to each alignment-aware normalization 417.
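The two-path structure can be sketched as a function that takes the normalization and convolution stages as parameters. Everything beyond "one path normalizes twice, the other once, and the results are summed" is an assumption here: the order of normalization and convolution within a path, the omission of activation and upsampling, and all names are illustrative only.

```python
def alias_res_block(h, norm, conv_main_1, conv_main_2, conv_shortcut):
    """Sum a path that is normalized twice with a path that is normalized once."""
    main = conv_main_2(norm(conv_main_1(norm(h))))   # two normalization stages
    shortcut = conv_shortcut(norm(h))                # one normalization stage
    return [m + s for m, s in zip(main, shortcut)]

# Stand-ins that let the data flow be checked: the "normalization" counts
# how often it is called and the "convolutions" pass values through unchanged.
calls = []

def counting_norm(h):
    calls.append(1)
    return h

def identity(h):
    return h

h_next = alias_res_block([1.0, 2.0], counting_norm, identity, identity, identity)
# With pass-through stages the block returns main + shortcut = 2 * h.
```

In the described embodiment `norm` would additionally receive the second data set, and the block input would be combined with the first data set; those connections are left out of this sketch.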

In one embodiment, each alignment-aware normalization 417 adapts the batch normalization (Batch Norm) structure of the known SPADE (Spatially-Adaptive Denormalization) algorithm to the present invention, and may have, for example, the structure shown in FIG. 16. In general, batch normalization is a technique that improves the performance of a deep learning algorithm by normalizing the output of each layer so that the output values maintain non-linearity. As shown in FIG. 16, γ (a scaling parameter) and β (a shift parameter) are computed from the results of applying several convolutions to the segmentation map, and the output of each layer is multiplied by γ and added to β before being passed to the next layer.

In an embodiment of the present invention, the second data set is input to this alignment-aware normalization 417 block as input data. That is, as shown in FIG. 16, the fitting garment region, the remaining region, and the misaligned region (Mmisalign) are input to the convolutions that generate γ and β, and the misaligned region (Mmisalign) may be standardized and input ahead of the γ and β operations.
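One possible reading of this normalization is sketched below on a flat list of activations. Two points are assumptions rather than statements from the text: the standardization is taken to use separate statistics for pixels inside and outside Mmisalign, and γ and β are supplied directly as per-pixel values, whereas in the embodiment they would be produced by convolutions over the segmentation inputs. The function names are hypothetical.

```python
from statistics import mean, pstdev

def standardize_by_region(h, m_misalign, eps=1e-5):
    """Standardize activations separately inside and outside the misaligned mask."""
    out = [0.0] * len(h)
    for flag in (0, 1):
        idx = [i for i, m in enumerate(m_misalign) if m == flag]
        if not idx:
            continue
        values = [h[i] for i in idx]
        mu, sigma = mean(values), pstdev(values)
        for i in idx:
            out[i] = (h[i] - mu) / (sigma + eps)
    return out

def modulate(h_std, gamma, beta):
    """Apply the scaling (gamma) and shift (beta) parameters element-wise."""
    return [g * x + b for x, g, b in zip(h_std, gamma, beta)]

h = [1.0, 3.0, 10.0, 30.0]      # toy activations
m_misalign = [0, 0, 1, 1]       # last two positions are misaligned pixels
h_std = standardize_by_region(h, m_misalign)
h_out = modulate(h_std, gamma=[2.0] * 4, beta=[1.0] * 4)
```

With separate statistics, the very different value ranges of the two regions are each brought to zero mean and unit scale before γ and β are applied, so neither region dominates the other.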

In alternative embodiments, some of the data in the first and second data sets input to the image synthesizing unit 400 may of course be changed or omitted. For example, FIG. 17 is a schematic block diagram of the operation of the image synthesizing unit 400 according to exemplary embodiments.

FIG. 17(a) shows an embodiment in which the pose (P) is omitted from the first data set and the first segmentation (S) is used in the second data set instead of the second segmentation. FIG. 17(b) shows an embodiment in which the pose (P) is omitted from the first data set and the image of the fitting garment (c) is used instead of the warped fitting garment (W). FIG. 17(c) shows an embodiment in which the original image (I) is used in the first data set instead of the clothing-removed image (Ia) and the pose (P) is omitted. It will thus be understood that, depending on the specific embodiment, images may be synthesized while changing or omitting, in various combinations, some of the data in the first and second data sets input to the image synthesizing unit 400.

FIG. 18 illustrates the effect of the present invention according to an embodiment.

In general, the higher the resolution of the composite image produced by virtual fitting, the more awkward areas become visible between the person region of the original image and the region of the warped fitting garment. In the present invention, as described above, the misaligned region (Mmisalign) is provided as one of the inputs to the image synthesis algorithm 410, so that the algorithm recognizes the misaligned region and generates texture to fill it, producing a natural-looking fitted garment.

FIG. 18(a) is a composite image according to the prior art. The red area marked on the person's shoulder corresponds to the misaligned region (Mmisalign). In conventional fitting techniques the image synthesis algorithm fails to handle this part properly, so the texture is not generated correctly and the output looks unnatural.

In contrast, FIG. 18(b) is a composite image according to an embodiment of the present invention. It can be seen that the image synthesis algorithm 410 recognizes the red misaligned region (Mmisalign) and generates texture to fill it, giving a more natural result than the prior art.

As described above, a person of ordinary skill in the art to which the present invention pertains will understand that various modifications and variations are possible from this description. The scope of the present invention should therefore not be limited to the described embodiments, but should be defined by the following claims and their equivalents.

10: computer device
100: preprocessing unit
200: segmentation unit
300: garment transforming unit
400: image synthesizing unit

Claims

A virtual fitting method for compositing a fitting garment onto an original image using a computer device, the method comprising:
generating a segmentation from the original image;
generating a warped fitting garment by deforming the fitting garment (c) based on the original image;
generating a misaligned region (Mmisalign), which is the part of the existing garment region of the segmentation that the region of the warped fitting garment does not overlap; and
generating a composite image by inputting input data including the original image and the fitting garment into an image synthesis algorithm.

The method according to claim 1,
wherein, in the generating of the misaligned region, an aligned region (Malign) in which the garment region of the segmentation and the region of the warped fitting garment overlap is generated, and the misaligned region is generated by excluding the aligned region from the garment region of the segmentation.

The method according to claim 2,
wherein the segmentation is a first segmentation (S) obtained by segmenting the original image, or a second segmentation in which the fitting garment is included as a class in place of the existing garment of the first segmentation.

The method according to claim 3,
wherein the computer device generates a clothing-removed segmentation (Sa) by removing the existing garment region from the first segmentation, and generates the second segmentation using data including the clothing-removed segmentation (Sa) and the fitting garment.

The method according to claim 1,
wherein, in the generating of the composite image, a first data set, which includes the warped fitting garment and either the original image or a clothing-removed image in which the existing garment is removed from the original image, and a second data set, which includes the misaligned region, are used as input data of the image synthesis algorithm.

The method according to claim 5,
wherein pose information (P) including feature points of the body is extracted from the original image, and the clothing-removed image is generated by removing a predetermined region from the original image based on the pose information (P).

The method according to claim 5,
wherein pose information (P) including feature points of the body is extracted from the original image, a predetermined region is removed from the original image based on the pose information (P), and the clothing-removed image is generated by overlaying the image of a preset class, among the classes of the segmentation, that is to be preserved in the composite image.

The method according to claim 5,
wherein the image synthesis algorithm comprises a decoder in which a plurality of residual blocks (ResBlk), each including convolution and upsampling operations, are connected in multiple stages, and the first data set and the second data set are input as input data to each of the plurality of residual blocks.

A computer-readable recording medium on which is recorded a computer program for executing the virtual fitting method according to any one of claims 1 to 8.