KR102446711B1

KR102446711B1 - An item image generation model, a method for generating an item image using the item image generation model, and an apparatus for generating an item image

Info

Publication number: KR102446711B1
Application number: KR1020220007294A
Authority: KR
Inventors: 정재원
Original assignee: 오드컨셉 주식회사
Priority date: 2022-01-18
Filing date: 2022-01-18
Publication date: 2022-09-26
Also published as: KR20230111571A

Abstract

A product image generation model according to an embodiment of the present invention comprises: an encoder unit including a first encoder for extracting a first feature from a first image related to a first product and a second encoder for extracting a second feature from a second image related to a second product; a mapping network which transforms a merged feature generated based on the first feature and the second feature to generate a first target feature for generating an image related to a third product to be coordinated with the first product and the second product; and a generator for generating, from the first target feature, a fake image related to a coordinating product related to a third product category. The generator is trained based on a difference between the fake image and a real image related to the third product category. The present invention can increase the accuracy of the product image generation model.

Description

A product image generation model, a method for generating a product image using the product image generation model, and a product image generation apparatus

The present application relates to a product image generation method and a product image generation apparatus. Specifically, the present application relates to a product image generation model that generates an image of a product to be coordinated with other products, and to a method and apparatus for generating such an image using the product image generation model.

As artificial intelligence technology advances, it is being applied across a wide range of industries. In e-commerce in particular, techniques that use artificial intelligence to recommend products according to user preferences are being actively studied.

Conventionally, personalized products have been recommended to a user based on the user's purchase history or product browsing history. In particular, a user's purchase history or browsing information has been compared with that of other users, so that a user who purchased a specific product is offered recommendations drawn from the purchase history or browsing information of other users with a similar purchase history. However, artificial intelligence models that recommend a product suited to products from a plurality of product categories have not been sufficiently studied.

Accordingly, there is a need to develop a new artificial intelligence model that automatically generates information on a product to be coordinated with products from a plurality of product categories, and a method of training such a model.

One problem to be solved by the present invention is to provide a product image generation model that generates coordinating product information for a plurality of products including a first product related to a first product category and a second product related to a second product category, a product image generation method using the model, and a product image generation apparatus.

The problems to be solved by the present invention are not limited to those described above, and problems not mentioned will be clearly understood by those of ordinary skill in the art to which the present invention belongs from this specification and the accompanying drawings.

According to an embodiment of the present application, a product image generation model generates a coordinating image related to a coordinating product to be coordinated with a first product and a second product, based on an image of the first product related to a first product category and an image of the second product related to a second product category. The product image generation model may include: an encoder unit including a first encoder that extracts a first feature from a first image related to the first product and a second encoder that extracts a second feature from a second image related to the second product; a mapping network that transforms a merged feature generated based on the first feature and the second feature to generate a first target feature for generating an image related to a third product to be coordinated with the first product and the second product; and a generator that generates, from the first target feature, a fake image related to a coordinating product related to a third product category. The generator may be trained based on a difference between the fake image and a real image related to the third product category.

According to an embodiment of the present application, a method trains a product image generation model that generates a coordinating image related to a coordinating product to be coordinated with a first product and a second product, based on an image of the first product related to a first product category and an image of the second product related to a second product category. The method may include: obtaining a plurality of images including a first image related to the first product and a second image related to the second product; extracting a first feature from the first image and a second feature from the second image; generating a merged feature based on the first feature and the second feature; transforming the merged feature to obtain a first target feature for generating an image related to a third product to be coordinated with the first product and the second product; generating a fake image related to a third product category from the first target feature; and training the product image generation model based on a difference between the fake image and a real image related to the third product category.

The means for solving the problems of the present invention are not limited to those described above, and means not mentioned will be clearly understood by those of ordinary skill in the art to which the present invention belongs from this specification and the accompanying drawings.

According to the product image generation model, the product image generation method using the model, and the product image generation apparatus of an embodiment of the present application, a coordinating image related to a coordinating product that matches a first product and a second product can be generated automatically from a plurality of images including a first image related to the first product and a second image related to the second product.

In addition, according to the product image generation model, the product image generation method using the model, and the product image generation apparatus of an embodiment of the present application, the accuracy of the product image generation model can be increased by training it in various ways, including competitive training using a generative adversarial network, applying a loss according to a comparison between pieces of information extracted by an extractor, and/or applying a loss based on a feature generated through an image embedding network.

According to the product image generation model, the product image generation method using the model, and the product image generation apparatus of an embodiment of the present application, a high-quality coordinating image can be provided to the user by removing noise from the coordinating image or improving its quality through a refinement network.

The effects of the present invention are not limited to those described above, and effects not mentioned will be clearly understood by those of ordinary skill in the art to which the present invention belongs from this specification and the accompanying drawings.

FIG. 1 is a schematic diagram of a product image generation apparatus according to an embodiment of the present application.
FIG. 2 is a schematic diagram illustrating a product image generation model according to an embodiment of the present application.
FIG. 3 is a schematic diagram illustrating aspects of a method of training a product image generation model according to an embodiment of the present application.
FIG. 4 is a schematic diagram illustrating a product image generation model according to another embodiment of the present application.
FIG. 5 is a flowchart illustrating a method of training a product image generation model according to an embodiment of the present application.
FIG. 6 is a flowchart illustrating a method of outputting a coordinating image using a product image generation model according to an embodiment of the present application.

The above objects, features, and advantages of the present application will become more apparent from the following detailed description taken in conjunction with the accompanying drawings. Because the present application may be modified in various ways and may have various embodiments, specific embodiments are illustrated in the drawings and described in detail below.

Throughout the specification, like reference numerals refer in principle to like elements. In addition, elements that appear in the drawings of different embodiments and have the same function within the scope of the same idea are described using the same reference numerals, and redundant descriptions of them are omitted.

Where a detailed description of a known function or configuration related to the present application is judged likely to obscure the gist of the present application unnecessarily, that description is omitted. In addition, ordinals used in this specification (for example, first, second, and so on) are merely identifiers for distinguishing one element from another.

In addition, the suffixes "module" and "unit" used for elements in the following embodiments are assigned or used interchangeably only for ease of drafting the specification, and do not in themselves have distinct meanings or roles.

In the following embodiments, a singular expression includes the plural unless the context clearly indicates otherwise.

In the following embodiments, terms such as "include" or "have" mean that the features or elements described in the specification are present, and do not preclude the possibility that one or more other features or elements may be added.

In the drawings, the sizes of elements may be exaggerated or reduced for convenience of description. For example, the size and thickness of each element shown in the drawings are chosen arbitrarily for convenience of description, and the present invention is not necessarily limited to what is illustrated.

Where an embodiment can be implemented differently, a specific sequence of processes may be performed in an order different from the one described. For example, two processes described in succession may be performed substantially simultaneously, or may be performed in the reverse of the described order.

In the following embodiments, when elements are said to be connected, this includes not only the case in which they are directly connected but also the case in which they are indirectly connected with other elements interposed between them.

For example, when elements are said in this specification to be electrically connected, this includes not only the case in which they are directly electrically connected but also the case in which they are indirectly electrically connected with other elements interposed between them.

According to an embodiment of the present application, the product image generation model may further include a discriminator that obtains the fake image and a real image and determines the authenticity of the fake image by comparing the real image with the fake image.

According to an embodiment of the present application, the generator may be trained so that the fake image approximates the real image, and the discriminator may be trained to judge the fake image to be fake and the real image to be real.
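The competing objectives of the generator and the discriminator can be sketched as a pair of loss functions. The following is a minimal illustration only, assuming the standard binary cross-entropy GAN objective with a non-saturating generator loss; the specification does not state which loss form is used, and the names `discriminator_loss`, `generator_loss`, and `EPS` are hypothetical.

```python
import math

EPS = 1e-12  # assumed small constant that keeps log() away from zero


def discriminator_loss(d_real: float, d_fake: float) -> float:
    """Binary cross-entropy for the discriminator.

    d_real and d_fake are the discriminator's probabilities that the real
    image and the fake image are real. The loss is small when d_real is
    close to 1 and d_fake is close to 0.
    """
    return -(math.log(d_real + EPS) + math.log(1.0 - d_fake + EPS))


def generator_loss(d_fake: float) -> float:
    """Non-saturating generator loss (an assumed, commonly used variant).

    The loss is small when the discriminator believes the fake image is real.
    """
    return -math.log(d_fake + EPS)


# A confident, correct discriminator has a lower loss than an undecided one.
confident = discriminator_loss(d_real=0.9, d_fake=0.1)
undecided = discriminator_loss(d_real=0.5, d_fake=0.5)

# The generator is rewarded when its fake image fools the discriminator.
fooled = generator_loss(d_fake=0.9)
caught = generator_loss(d_fake=0.1)
```

In an actual training loop the two losses would typically be minimized in alternation, each with respect to its own network's parameters.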

According to an embodiment of the present application, the product image generation model may further include an extractor that includes an edge extractor for extracting edge information from at least one of the fake image and the real image, and a feature extractor for extracting feature information from at least one of the fake image and the real image.

According to an embodiment of the present application, the extractor may, through the edge extractor, extract first edge information from the fake image and second edge information from the real image, and may, through the feature extractor, obtain first feature information from the fake image and second feature information from the real image. The generator may be trained based on a difference between the first edge information and the second edge information, or based on a difference between the first feature information and the second feature information.
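An edge-based loss of this kind can be illustrated as follows. This is a sketch under stated assumptions: the specification does not say how edge information is extracted, so a 3x3 Sobel filter stands in for the edge extractor, images are grayscale nested lists, and the mean absolute difference is used as the loss. All function names are hypothetical.

```python
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def correlate3x3(image, kernel):
    """Slide a 3x3 kernel over a 2D grayscale image (no padding)."""
    height, width = len(image), len(image[0])
    return [
        [
            sum(kernel[i][j] * image[r + i][c + j]
                for i in range(3) for j in range(3))
            for c in range(width - 2)
        ]
        for r in range(height - 2)
    ]


def edge_map(image):
    """Approximate gradient magnitude as |gx| + |gy| (assumed edge extractor)."""
    gx = correlate3x3(image, SOBEL_X)
    gy = correlate3x3(image, SOBEL_Y)
    return [[abs(a) + abs(b) for a, b in zip(row_x, row_y)]
            for row_x, row_y in zip(gx, gy)]


def edge_loss(fake_image, real_image):
    """Mean absolute difference between the edge maps of the two images."""
    fake_edges, real_edges = edge_map(fake_image), edge_map(real_image)
    diffs = [abs(a - b)
             for row_f, row_r in zip(fake_edges, real_edges)
             for a, b in zip(row_f, row_r)]
    return sum(diffs) / len(diffs)


# Real image with a vertical edge; the flat fake image has no edges at all.
real = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
flat_fake = [[0.5] * 4 for _ in range(4)]
loss_flat = edge_loss(flat_fake, real)
loss_same = edge_loss(real, real)
```

A loss on feature information would have the same shape, with the Sobel filter replaced by whatever feature extractor is used.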

According to an embodiment of the present application, the product image generation model may include an image embedding network that extracts a second target feature based on at least one of the fake image and the real image. The generator may be trained so that the fake image approximates the real image, based on a loss derived from a difference between the first target feature and the second target feature, or on a loss derived from a difference between the real image and a fake image generated from the second target feature.

According to an embodiment of the present application, the product image generation model may include an image embedding network that extracts a second target feature based on the fake image and extracts a second target feature based on the real image. The generator may be trained based on a loss between the second target feature extracted from the fake image and the second target feature extracted from the real image.
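The embedding-based loss can be sketched in the same spirit. The image embedding network in the specification is a learned network whose architecture is not given here, so the sketch below substitutes a fixed toy embedding (per-row mean intensity) purely for illustration; the mean squared error is likewise an assumed choice, and `toy_embedding` and `embedding_loss` are hypothetical names.

```python
def toy_embedding(image):
    """Hypothetical stand-in for the image embedding network.

    Maps a 2D grayscale image to a feature vector of per-row mean intensities.
    """
    return [sum(row) / len(row) for row in image]


def embedding_loss(fake_image, real_image, embed=toy_embedding):
    """Mean squared error between the features embedded from each image."""
    fake_feature = embed(fake_image)
    real_feature = embed(real_image)
    squared = [(a - b) ** 2 for a, b in zip(fake_feature, real_feature)]
    return sum(squared) / len(squared)


fake = [[1.0, 1.0], [0.0, 0.0]]
real = [[0.0, 0.0], [0.0, 0.0]]
loss_different = embedding_loss(fake, real)
loss_identical = embedding_loss(real, real)
```

Because the loss is computed in feature space rather than pixel space, two images that differ pixel by pixel but embed to similar features would incur only a small penalty.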

According to an embodiment of the present application, the product image generation model may further include a refinement network that removes noise from the fake image obtained through the generator or improves the quality of the fake image.

According to an embodiment of the present application, the merged feature may be generated by normalizing each of the first feature and the second feature and then merging them.

According to an embodiment of the present application, a method trains a product image generation model that generates a coordinating image related to a coordinating product to be coordinated with a first product and a second product, based on an image of the first product related to a first product category and an image of the second product related to a second product category. The method may include: obtaining a plurality of images including a first image related to the first product and a second image related to the second product; extracting a first feature from the first image and a second feature from the second image; generating a merged feature based on the first feature and the second feature; transforming the merged feature to obtain a first target feature for generating an image related to a third product to be coordinated with the first product and the second product; generating a fake image related to a third product category from the target feature; and obtaining a real image, determining the authenticity of the fake image by comparing the real image with the fake image, and training the product image generation model based on the result of the determination.

According to an embodiment of the present application, a computer-readable recording medium may be provided on which a program for executing the method of training the product image generation model is recorded.

Hereinafter, a product image generation model, a product image generation method, and a product image generation apparatus (or product image generation server, hereinafter referred to as the product image generation apparatus) according to embodiments of the present application are described with reference to FIGS. 1 to 6.

FIG. 1 is a schematic diagram of a product image generation apparatus according to an embodiment of the present application.

The product image generation apparatus 1000 according to an embodiment of the present application may train the product image generation model 100 based on a training set including a plurality of images for each product category. In addition, the product image generation apparatus 1000 may use the product image generation model 100 to obtain, from a plurality of target images for each product category, a coordinating image related to a product to be coordinated with the products shown in those target images.

The product image generation apparatus 1000 according to an embodiment of the present application may include a transceiver 1100, a memory 1200, and a processor 1300.

The transceiver 1100 may communicate with any external device, including a user terminal. For example, the product image generation apparatus 1000 may obtain images for each product category through the transceiver 1100. The product image generation apparatus 1000 may also obtain, through the transceiver 1100, any execution data for running the product image generation model 100. Here, execution data encompasses any data appropriate for running the product image generation model 100, including its structure information, layer information, computation libraries, and parameter sets related to the weights included in the model. In addition, the product image generation apparatus 1000 may, through the transceiver 1100, transmit or output the coordinating image obtained through the product image generation model 100 to any external device, including a user terminal.

The product image generation apparatus 1000 may connect to a network through the transceiver 1100 to transmit and receive various data. The transceiver 1100 may broadly be of a wired type or a wireless type. Because each type has its own strengths and weaknesses, the product image generation apparatus 1000 may in some cases be provided with both. A wireless transceiver may mainly use a WLAN (Wireless Local Area Network) communication scheme such as Wi-Fi, or may use cellular communication such as LTE or 5G. The wireless communication protocol is not limited to these examples, and any suitable wireless communication scheme may be used. For a wired transceiver, LAN (Local Area Network) and USB (Universal Serial Bus) communication are representative examples, and other schemes are also possible.

The memory 1200 may store various kinds of information, either temporarily or semi-permanently. Examples of the memory 1200 include a hard disk drive (HDD), a solid state drive (SSD), flash memory, read-only memory (ROM), and random access memory (RAM). The memory 1200 may be built into the product image generation apparatus 1000 or may be detachable. The memory 1200 may store an operating system (OS) for running the product image generation apparatus 1000, programs for operating each component of the apparatus, and various other data required for the operation of the apparatus.

The processor 1300 may control the overall operation of the product image generation apparatus 1000, including the operation of training the product image generation model 100 and/or the operation of obtaining a coordinating image using the product image generation model 100, both described later. Specifically, the processor 1300 may load a program for the overall operation of the product image generation apparatus 1000 from the memory 1200 and execute it. The processor 1300 may be implemented as an application processor (AP), a central processing unit (CPU), or a similar device, in hardware, software, or a combination of the two. In hardware, it may be provided as an electronic circuit that processes electrical signals to perform a control function; in software, it may be provided as a program or code that drives the hardware circuit.

Hereinafter, the structure of the product image generation model 100 and the method of training it according to an embodiment of the present application are described in detail with reference to FIG. 2. FIG. 2 is a schematic diagram illustrating the product image generation model 100 according to an embodiment of the present application.

The product image generation model 100 according to an embodiment of the present application may include: an encoder unit 110 including at least one encoder 111, 113, 115; a mapping network 120 that converts a merged feature, generated based on the features extracted by the encoder unit 110, into a first target feature; and a generator 130 that generates a fake image based on the first target feature.
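The way these three components are chained can be sketched structurally. The code below is only an illustration of the data flow (encode each image, merge, map, generate) under the assumption that each component is a plain callable; the class name, the toy stand-in functions, and the simple concatenation without normalization are all hypothetical and not taken from the specification.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Image = List[List[float]]
Feature = List[float]


@dataclass
class ProductImageGenerationModel:
    encoders: Sequence[Callable[[Image], Feature]]  # encoder unit 110
    mapping_network: Callable[[Feature], Feature]   # mapping network 120
    generator: Callable[[Feature], Image]           # generator 130

    def generate(self, images: Sequence[Image]) -> Image:
        """Encode each image, merge the features, map, then generate."""
        if len(images) != len(self.encoders):
            raise ValueError("expected one image per encoder")
        merged: Feature = []
        for encode, image in zip(self.encoders, images):
            merged.extend(encode(image))  # plain concatenation (assumption)
        first_target_feature = self.mapping_network(merged)
        return self.generator(first_target_feature)


# Toy stand-ins so the sketch runs; real components would be neural networks.
def row_means(image):
    return [sum(row) / len(row) for row in image]


def identity(feature):
    return list(feature)


def to_square_image(feature):
    return [[value] * len(feature) for value in feature]


model = ProductImageGenerationModel(
    encoders=[row_means, row_means],
    mapping_network=identity,
    generator=to_square_image,
)
top_image = [[1.0, 3.0]]
bottom_image = [[2.0, 2.0], [0.0, 4.0]]
fake_image = model.generate([top_image, bottom_image])
```

Keeping one encoder per product category mirrors the description of the encoder unit; adding an N-th category would only require appending another encoder to the sequence.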

The encoder unit 110 may include one or more encoders (for example, a first encoder 111, a second encoder 113, and an N-th encoder 115). The encoder unit 110 may obtain an image related to a product and extract from the image features related to attributes of the product, including its material, style, size, proportion, shape, and/or color.

For example, the first encoder 111 may receive a first image of a first product related to a first product category (for example, tops) and extract from the first image a first feature related to attributes of the first product. The second encoder 113 may receive a second image of a second product related to a second product category different from the first (for example, bottoms or shoes) and extract from the second image a second feature related to attributes of the second product.

The product image generation model 100 may be configured to obtain a plurality of features including the first feature and the second feature, and to merge (concatenate) the plurality of features to generate a merged feature.

As one example, the product image generation model 100 may be configured to perform normalization on each of the plurality of features, including the first feature and the second feature, and then merge the normalized features to generate the merged feature.

As another example, the product image generation model 100 may be configured to perform normalization on each of the plurality of features, including the first feature and the second feature, merge the normalized features to generate an intermediate feature, and then perform normalization on the intermediate feature to generate the merged feature.

Here, normalization refers to any data processing technique that converts an arbitrary vector into a unit vector. The product image generation model 100 may normalize the plurality of features obtained from the encoder unit 110, or the intermediate feature, through L1 normalization and/or L2 normalization.
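The two merging variants described above can be illustrated with a minimal sketch. It assumes features are plain lists of floats; the function names (`l1_normalize`, `l2_normalize`, `merge_features`) and the `renormalize` flag are hypothetical and are not taken from the application.

```python
import math

def l2_normalize(vec, eps=1e-12):
    """Scale vec to unit Euclidean length (L2 normalization)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / max(norm, eps) for x in vec]

def l1_normalize(vec, eps=1e-12):
    """Scale vec so its absolute values sum to 1 (L1 normalization)."""
    norm = sum(abs(x) for x in vec)
    return [x / max(norm, eps) for x in vec]

def merge_features(features, normalize=l2_normalize, renormalize=False):
    """Normalize each feature, then concatenate.

    With renormalize=True the concatenation is treated as an intermediate
    feature and normalized once more, as in the second variant.
    """
    merged = []
    for feature in features:
        merged.extend(normalize(feature))
    return normalize(merged) if renormalize else merged

# Illustrative values only (assumed, not from the application).
first_feature = [3.0, 4.0]
second_feature = [0.0, 2.0]
merged = merge_features([first_feature, second_feature])
merged_renorm = merge_features([first_feature, second_feature], renormalize=True)
```

Normalizing each feature before concatenation keeps one encoder's output from dominating the merged feature simply because its values are larger.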

At this point, the merged feature contains only features related to the first product and the second product, and may not contain features related to the third product category to which a third product, to be coordinated with the first and second products, belongs. Accordingly, the product image generation model 100 according to an embodiment of the present application may transform the merged feature to obtain a first target feature for generating a third product that belongs to the third product category and is to be coordinated with the first and second products. For example, the product image generation model 100 may include a mapping network 120, and may transform the merged feature through the mapping network 120 to obtain the first target feature for generating an image related to the third product. The mapping network 120 may provide the effect of causing the first target feature to be generated within a specific boundary.

The mapping network 120 according to an embodiment of the present application may be a convolutional neural network (CNN) or a multi-layer perceptron (MLP) neural network composed of fully connected (dense) layers.
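As a rough illustration of the MLP option, the sketch below passes a merged feature through a stack of fully connected layers. The layer sizes, the leaky-ReLU activation, the random initialization, and all function names are assumptions made for this example; the application does not specify them.

```python
import random

def make_layer(n_in, n_out, rng):
    """Create one dense layer as (weights, biases) with small random weights."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def dense(x, layer):
    """Apply a fully connected layer: y = W x + b."""
    weights, biases = layer
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]

def leaky_relu(x, slope=0.2):
    """Element-wise leaky ReLU (an assumed choice of activation)."""
    return [v if v > 0 else slope * v for v in x]

def mapping_network(merged_feature, layers):
    """Map a merged feature to a first target feature through dense layers."""
    h = merged_feature
    for layer in layers[:-1]:
        h = leaky_relu(dense(h, layer))
    return dense(h, layers[-1])  # final layer left linear

rng = random.Random(0)
layers = [make_layer(4, 8, rng), make_layer(8, 8, rng), make_layer(8, 6, rng)]
first_target_feature = mapping_network([0.6, 0.8, 0.0, 1.0], layers)
```

In practice the weights would be learned jointly with the generator rather than fixed at their random initial values.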

However, if necessary, the product image generation model 100 according to an embodiment of the present application may omit the mapping network. In this case, the product image generation model 100 may be configured to input the merged feature directly into the generator described below.

The generator 130 may obtain the first target feature and, based on the first target feature, generate a fake image of a coordinating product related to the third product category.

At this time, the product image generation model 100 (or the product image generating apparatus 1000) may obtain a real image related to the coordinating product and, based on the real image and the fake image, train the generator so that the fake image approximates the real image. For example, the product image generation model 100 may train the generator 130 by applying to it an L1 loss related to the absolute value of the difference between the real image and the fake image. As another example, the product image generation model 100 may compare the real image with the fake image to obtain an L2 loss related to the Euclidean distance between them, and train the generator 130 by applying the L2 loss to it.
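The two reconstruction losses mentioned here can be sketched as follows, assuming images are 2-D lists of pixel intensities of equal size. Averaging over pixels and using the squared difference for the L2 term are common conventions assumed for this example, and the function names are illustrative.

```python
def l1_loss(fake, real):
    """Mean absolute difference between two equally sized 2-D images."""
    diffs = [abs(f - r)
             for f_row, r_row in zip(fake, real)
             for f, r in zip(f_row, r_row)]
    return sum(diffs) / len(diffs)

def l2_loss(fake, real):
    """Mean squared difference (squared Euclidean distance per pixel)."""
    diffs = [(f - r) ** 2
             for f_row, r_row in zip(fake, real)
             for f, r in zip(f_row, r_row)]
    return sum(diffs) / len(diffs)

# Illustrative 2x2 images (assumed values).
fake_image = [[0.0, 0.5], [1.0, 0.5]]
real_image = [[0.0, 1.0], [1.0, 0.0]]
l1_value = l1_loss(fake_image, real_image)
l2_value = l2_loss(fake_image, real_image)
```

Both values fall to zero when the fake image matches the real image exactly; the L2 form penalizes large per-pixel errors more heavily than the L1 form.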

Hereinafter, the method of training the product image generation model 100 according to an embodiment of the present application is described in detail with reference to FIG. 3. FIG. 3 is a schematic diagram illustrating aspects of the method of training the product image generation model 100 according to an embodiment of the present application.

The product image generation model 100 according to an embodiment of the present application may further include a discriminator 140. The discriminator 140 may obtain the real image and the fake image generated by the generator 130, and compare the real image with the fake image to determine the authenticity of the fake image. Here, the generator 130 and the discriminator 140 may form a generative adversarial network (GAN) and be trained in a competitive manner. Specifically, the discriminator 140 is trained to judge the fake image as fake and the real image as real, while the generator 130 is trained, based on the judgment of the discriminator 140, to output a fake image that approximates the real image so that the discriminator 140 judges the fake image to be real. For example, the discriminator 140 may compute an authenticity index (for example, a probability of being real, or a score of any form) for each of the real image and the fake image, and the generator 130 may be trained to output fake images for which the discriminator 140 computes an authenticity index close to that of the real image (or to a target authenticity index).
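One common way to express this adversarial objective is with binary cross-entropy on the discriminator's probability output. The sketch below assumes the authenticity index is a probability in (0, 1); the application also allows other score forms, and the function names are hypothetical.

```python
import math

def bce(prob, target, eps=1e-7):
    """Binary cross-entropy for a single probability and a 0/1 target."""
    p = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))

def discriminator_loss(d_real, d_fake):
    """Discriminator wants real -> 1 and fake -> 0."""
    return bce(d_real, 1.0) + bce(d_fake, 0.0)

def generator_adversarial_loss(d_fake):
    """Generator wants the discriminator to score its fake image as real."""
    return bce(d_fake, 1.0)

# d_real / d_fake are the discriminator's outputs for a real and a fake image.
confident_d = discriminator_loss(d_real=0.9, d_fake=0.1)
confused_d = discriminator_loss(d_real=0.5, d_fake=0.5)
fooled_g = generator_adversarial_loss(d_fake=0.9)
caught_g = generator_adversarial_loss(d_fake=0.1)
```

The discriminator's loss is low when it separates real from fake, while the generator's loss is low when its fake image receives a high authenticity score, which is what makes the two objectives competitive.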

Meanwhile, the number of training iterations of the generator 130 and of the discriminator 140 at each stage of training (that is, the training balance) may be preset to appropriate values. For example, in the early stage of training, the number of training iterations of the generator 130 may be set to be relatively larger than that of the discriminator 140. As another example, in the middle stage of training, the difference between the number of training iterations of the generator 130 and that of the discriminator 140 may be set to be smaller than a preset threshold. As yet another example, in the late stage of training, the number of training iterations of the discriminator 140 may be set to be equal to or greater than that of the generator 130. However, these are only examples, and the numbers of training iterations of the generator 130 and the discriminator 140 may be preset to any appropriate values.
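A training balance of this kind could be expressed as a small schedule function. The three equal-length phases and the specific ratios below are assumptions chosen only to match the examples in the text (generator favored early, balanced in the middle, discriminator favored late); `update_counts` is a hypothetical name.

```python
def update_counts(step, total_steps, early=(2, 1), middle=(1, 1), late=(1, 2)):
    """Return (generator_updates, discriminator_updates) for this step.

    The phase boundaries (thirds of training) and the ratios are
    illustrative presets, not values from the application.
    """
    progress = step / total_steps
    if progress < 1 / 3:
        return early
    if progress < 2 / 3:
        return middle
    return late

# Example: a 90-step run.
early_counts = update_counts(0, 90)
middle_counts = update_counts(45, 90)
late_counts = update_counts(89, 90)
```

A training loop would call this once per step and run the returned number of generator and discriminator updates.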

The product image generation model 100 according to an embodiment of the present application may further include an extractor 150 that includes an edge extractor 152, which extracts edge information from the fake image and/or the real image, and a feature extractor 154, which extracts feature information from the fake image and/or the real image.

Specifically, the extractor 150 may extract first edge information from the fake image through the edge extractor 152, and may extract second edge information from the real image through the edge extractor 152. The generator 130 may then be trained based on the first edge information related to the fake image and the second edge information related to the real image. Specifically, the generator 130 may be trained to generate fake images that reduce the difference between the first edge information and the second edge information.
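The edge-based term can be illustrated with a Sobel filter standing in for the edge extractor. The application does not say how edges are extracted, so the Sobel operator, the grayscale 2-D list representation, and the function names are all assumptions for this sketch.

```python
def sobel_edges(img):
    """Gradient magnitude at the interior pixels of a 2-D grayscale image."""
    kx = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
    ky = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]
    height, width = len(img), len(img[0])
    edges = []
    for y in range(1, height - 1):
        row = []
        for x in range(1, width - 1):
            gx = sum(kx[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(ky[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            row.append((gx * gx + gy * gy) ** 0.5)
        edges.append(row)
    return edges

def edge_l1_loss(fake, real):
    """Mean absolute difference between the edge maps of fake and real."""
    e_fake, e_real = sobel_edges(fake), sobel_edges(real)
    diffs = [abs(a - b)
             for row_a, row_b in zip(e_fake, e_real)
             for a, b in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)

# A real image with a vertical edge and a flat fake image (assumed values).
real_image = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
flat_fake = [[0.5, 0.5, 0.5, 0.5] for _ in range(4)]
edge_gap = edge_l1_loss(flat_fake, real_image)
```

The flat fake image has no edges, so the loss equals the mean edge strength of the real image; it drops to zero once the fake image reproduces the same edges.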

In addition, the extractor 150 may extract first feature information from the fake image through the feature extractor 154, and may extract second feature information from the real image through the feature extractor 154. The generator 130 may then be trained based on the first feature information related to the fake image and the second feature information related to the real image. Specifically, the generator 130 may be trained to generate fake images that reduce the difference between the first feature information and the second feature information.

More specifically, the product image generation model 100 (or the product image generating apparatus 1000) may train the generator 130 by applying to it an L1 loss related to the absolute value of the difference between the first feature information and the second feature information (or between the first edge information and the second edge information). Alternatively, the product image generation model 100 (or the product image generating apparatus 1000) may train the generator 130 by applying to it an L2 loss related to the Euclidean distance between the first feature information and the second feature information (or between the first edge information and the second edge information).

Hereinafter, the structure of the product image generation model 100 according to another embodiment of the present application, and the method of training it, are described in detail with reference to FIG. 4. FIG. 4 is a schematic diagram illustrating the product image generation model 100 according to another embodiment of the present application.

The product image generation model 100 according to an embodiment of the present application may include: an encoder unit 110 including at least one encoder 111, 113, 115; a mapping network 120 that transforms a merged feature, generated based on the features extracted by the encoder unit 110, into a first target feature; a generator 130 that generates a fake image based on the first target feature; an extractor 150 that extracts edge information and feature information from each of the real image and the fake image; an image embedding network 160 that generates a second target feature corresponding to the first target feature based on the real image and/or the fake image; and a refinement network 170 for removing noise from the fake image or improving its quality. Although not shown in FIG. 4, the product image generation model 100 may also include a discriminator 140 for adversarially training the generator 130 based on the fake image and the real image.

The description of the encoder unit 110, the mapping network 120, the generator 130, the discriminator 140, and the extractor 150 given above with reference to FIGS. 2 and 3 applies by analogy to FIG. 4. Accordingly, details of the encoder unit 110, the mapping network 120, the generator 130, the discriminator 140, and the extractor 150 may be omitted. This is only for convenience of explanation and should not be construed as limiting.

As described above, the extractor 150 of the product image generation model 100 may include the edge extractor 152 and the feature extractor 154. The extractor 150 may extract first edge information from the fake image through the edge extractor 152, and may extract second edge information from the real image through the edge extractor 152.

In addition, the extractor 150 may extract first feature information from the fake image through the feature extractor 154, and may extract second feature information from the real image through the feature extractor 154.

The image embedding network 160 of the product image generation model 100 may obtain the real image and/or the fake image and, based on the real image and/or the fake image, generate a second target feature corresponding to the first target feature. As one example, the image embedding network 160 may obtain the real image and generate from it a second target feature corresponding to the first target feature. As another example, the image embedding network 160 may obtain the fake image and generate from it a second target feature corresponding to the first target feature.

At this time, the product image generation model 100 (or the product image generating apparatus 1000) may train the generator 130 (or the image embedding network 160), based on the difference between the first target feature and the second target feature, so that the two features approximate each other. For example, the product image generation model 100 may compare the first target feature with the second target feature and, according to the comparison result, apply a loss to the generator 130 (or the image embedding network 160) that reduces the difference between them. For example, the generator 130 may be trained by applying to it a loss according to the difference between the first target feature and the second target feature extracted from the fake image. Alternatively, the generator 130 may be trained by applying to it a loss according to the difference between the first target feature and the second target feature extracted from the real image.

In addition, the product image generation model 100 may train the generator 130 so as to reduce the difference between the real image and a fake image generated based on the second target feature (or the first target feature). As one example, the generator 130 may be trained by applying to it a loss (for example, an L1 or L2 loss) according to the difference between the real image and a fake image generated from the second target feature that the image embedding network 160 extracted from the real image. As another example, the generator 130 may be trained by applying to it a loss (for example, an L1 or L2 loss) according to the difference between the real image and a fake image generated from the second target feature that the image embedding network 160 extracted from the fake image.

Meanwhile, the examples described above of training the generator 130 based on the second target feature extracted through the image embedding network 160 may be combined as appropriate. For example, the product image generating apparatus 1000 may be configured to additionally obtain a loss function according to the difference between the second target feature extracted from the fake image and the second target feature extracted from the real image, and to train the generator 130 based on it. The product image generating apparatus 1000 may also be implemented to train the generator 130 based on differences among the fake image generated from the second target feature extracted from the fake image, the fake image generated from the second target feature extracted from the real image, and/or the real image.
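One possible way to combine the feature-level terms discussed above is a weighted sum of pairwise distances among the first target feature and the two second target features. The mean squared distance, the weights, and the function names below are assumptions for illustration; the application leaves the exact combination open.

```python
def mean_squared_distance(a, b):
    """Mean squared difference between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def embedding_consistency_loss(first_target, second_from_fake, second_from_real,
                               w_fake=1.0, w_real=1.0, w_pair=1.0):
    """Weighted sum of three feature-level terms (weights are hypothetical).

    - first target feature vs. second target feature embedded from the fake image
    - first target feature vs. second target feature embedded from the real image
    - the two second target features against each other
    """
    return (w_fake * mean_squared_distance(first_target, second_from_fake)
            + w_real * mean_squared_distance(first_target, second_from_real)
            + w_pair * mean_squared_distance(second_from_fake, second_from_real))

# Illustrative 2-D features (assumed values).
first_target = [0.0, 0.0]
from_fake = [1.0, 1.0]
from_real = [0.0, 0.0]
consistency = embedding_consistency_loss(first_target, from_fake, from_real)
```

Setting a weight to zero disables the corresponding term, which makes it easy to try the individual examples from the text separately.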

The refinement network 170 of the product image generation model 100 may remove noise from the fake image output by the generator 130 or improve the quality of the fake image. For example, the refinement network 170 may be trained, based on the fake image and a target image in which the quality of the fake image has been improved or its noise removed, so that the fake image approximates the target image. Once trained, the refinement network 170 may output, based on the fake image obtained from the generator 130, a coordination image with improved quality.

Meanwhile, FIGS. 2 to 4 show the encoder unit 110 as including a plurality of encoders 111, 113, 115, with each image input to its own corresponding encoder. However, this is only an example, and the encoder unit 110 may instead include a single encoder configured to extract a feature from each of a plurality of images, including the first image and the second image.

The product image generating apparatus 1000 may obtain a coordination image by using the product image generation model 100 that has been trained as described with reference to FIGS. 2 to 4. Specifically, the product image generating apparatus 1000 may obtain the trained product image generation model 100 and/or execution data for running the product image generation model 100, and may obtain a plurality of target images including a first target image of a first target product related to the first product category and a second target image of a second target product related to the second product category. The product image generating apparatus 1000 may then use the product image generation model 100 to obtain a coordination image of a coordinating product to be coordinated with the first target product and the second target product.

Because the product image generation model 100 has been trained to output an image related to a third product category, different from the first and second product categories, based on a plurality of images including a first image related to the first product category and a second image related to the second product category, the coordination image may be an image of a product in a product category different from those of the first target product and the second target product.

Specifically, the product image generating apparatus 1000 may obtain the coordination image through the product image generation model 100, which has been trained to receive a plurality of images including the first image and the second image and to output a coordination image of a coordinating product to be coordinated with the first target product and the second target product.

Hereinafter, the method of training the product image generation model 100 according to an embodiment of the present application is described with reference to FIG. 5. FIG. 5 is a flowchart illustrating the method of training the product image generation model 100 according to an embodiment of the present application. In describing the training method, some embodiments that overlap with the description given above with reference to FIGS. 2 to 4 may be omitted. This is only for convenience of explanation and should not be construed as limiting.

The method of training the product image generation model 100 according to an embodiment of the present application may include: obtaining a plurality of images including a first image related to a first product and a second image related to a second product (S1100); extracting a first feature from the first image and a second feature from the second image (S1200); generating a merged feature based on the first feature and the second feature (S1300); transforming the merged feature to obtain a first target feature for generating an image related to a third product to be coordinated with the first product and the second product (S1400); generating a fake image related to a third product category from the first target feature (S1500); and obtaining a real image and training the product image generation model 100 based on the real image and the fake image (S1600).
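The flow of steps S1200 to S1600 can be outlined as a single forward pass that takes each component as a callable. This is only a structural sketch: the stand-in components below are trivial placeholders invented for the example, the parameter update of S1600 is omitted, and none of the names come from the application.

```python
def training_forward(first_image, second_image, real_image,
                     encoders, merge, mapping_network, generator, loss_fn):
    """Run S1200-S1500 and compute the S1600 loss (no weight update here)."""
    first_feature = encoders[0](first_image)          # S1200
    second_feature = encoders[1](second_image)        # S1200
    merged = merge([first_feature, second_feature])   # S1300
    first_target = mapping_network(merged)            # S1400
    fake_image = generator(first_target)              # S1500
    return fake_image, loss_fn(fake_image, real_image)  # S1600

def mean_abs_error(fake, real):
    """L1-style loss over flat lists of pixel values."""
    return sum(abs(f - r) for f, r in zip(fake, real)) / len(fake)

# Placeholder components (assumptions); real ones would be neural networks.
encoders = [lambda img: [sum(img) / len(img)], lambda img: [max(img)]]
merge = lambda features: [v for f in features for v in f]
mapping_network = lambda merged: [2.0 * v for v in merged]
generator = lambda target: [target[0], target[1], target[0] + target[1]]

fake, loss = training_forward(
    first_image=[0.0, 1.0],        # S1100: e.g. a top
    second_image=[0.2, 0.4],       # S1100: e.g. bottoms
    real_image=[1.0, 1.0, 2.0],    # real image of the third category
    encoders=encoders, merge=merge, mapping_network=mapping_network,
    generator=generator, loss_fn=mean_abs_error,
)
```

In an actual training loop, the returned loss would be backpropagated to update the generator and, depending on the configuration, the mapping network and encoders.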

In the step of obtaining a plurality of images including a first image related to a first product and a second image related to a second product (S1100), the product image generating apparatus 1000 may obtain a plurality of product images including a first image of a first product related to a first product category and a second image of a second product related to a second product category. For example, the product image generating apparatus 1000 may receive, through the transceiver 1100, the first image of the first product related to the first product category (for example, tops). The product image generating apparatus 1000 may also receive, through the transceiver 1100, the second image of the second product related to the second product category (for example, bottoms or shoes).

In the step of extracting a first feature from the first image and a second feature from the second image (S1200), the product image generating apparatus 1000 may input the plurality of product images, including the first image and the second image, to the encoder unit 110 and obtain the features output by the encoder unit 110. As described above, the encoder unit 110 may include one or more encoders 111, 113, 115. The first encoder 111 may receive the first image and, based on it, extract a first feature related to attributes of the first product. The second encoder 113 may obtain the second image and, based on it, extract a second feature related to attributes of the second product. Likewise, the N-th encoder 115 may obtain an N-th image and, based on it, extract an N-th feature related to attributes of an N-th product. Here, the first to N-th features related to product attributes may relate to the material, style, size, proportion, shape, and/or color of each product.

In the step of generating a merged feature based on the first feature and the second feature (S1300), the product image generating apparatus 1000 may merge the plurality of features obtained through the encoder unit 110, including the first feature and the second feature, to generate the merged feature. For example, the product image generating apparatus 1000 may be configured to perform normalization on each of the plurality of features, including the first feature and the second feature, and then merge them to generate the merged feature. As another example, the product image generating apparatus 1000 may be configured to perform normalization on each of the plurality of features, merge them to generate an intermediate feature, and then perform normalization on the intermediate feature to generate the merged feature.

In the step of transforming the merged feature to obtain a first target feature for generating an image related to a third product to be coordinated with the first product and the second product (S1400), the product image generating apparatus 1000 may transform the merged feature to obtain a first target feature for generating an image related to the third product category to which the third product belongs. For example, the product image generating apparatus 1000 may be configured to transform the merged feature through the mapping network 120 to obtain the first target feature for generating a fake image related to the third product category.

In the step of generating a fake image related to the third product category from the first target feature (S1500), the product image generating apparatus 1000 may generate or obtain the fake image through the generator 130, which generates, based on the first target feature, a fake image of a coordinating product related to the third product category.

In the step of obtaining a real image and training the product image generation model 100 based on the real image and the fake image (S1600), the product image generating apparatus 1000 may train the product image generation model 100 based on the difference between the fake image output by the generator 130 and a real image related to the third product category. Specifically, the product image generating apparatus 1000 may compare the fake image with the real image and train the product image generation model 100 (for example, the generator 130) based on the comparison result.

As an example, the product image generating apparatus 1000 may train the generator 130 by applying an L1 loss or an L2 loss to the generator 130 based on the difference between the fake image and the real image.
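For illustration only, the L1 and L2 losses mentioned above might be computed as in the following sketch, in which images are flattened to lists of pixel intensities. Averaging over pixels and the function names are assumptions; the present application states only that an L1 or L2 loss is based on the difference between the two images.

```python
def l1_loss(fake, real):
    """Mean absolute difference between corresponding pixels."""
    return sum(abs(f - r) for f, r in zip(fake, real)) / len(fake)

def l2_loss(fake, real):
    """Mean squared difference between corresponding pixels."""
    return sum((f - r) ** 2 for f, r in zip(fake, real)) / len(fake)

fake_image = [0.2, 0.5, 0.9, 0.1]  # toy generator output
real_image = [0.0, 0.5, 1.0, 0.3]  # toy real image of the third category

pixel_l1 = l1_loss(fake_image, real_image)
pixel_l2 = l2_loss(fake_image, real_image)
```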

As an example, the product image generating apparatus 1000 may train the generator 130 of the product image generation model 100 by adopting a competitive (adversarial) learning technique. Specifically, the product image generation model 100 may further include a discriminator 140 for competitive learning. The discriminator 140 may obtain the real image and the fake image and compare them to determine the authenticity of the fake image. The generator 130 and the discriminator 140 may form a generative adversarial network (GAN) and be trained in competition with each other. Specifically, the discriminator 140 may be trained to judge the fake image to be fake and the real image to be real. The generator 130, in turn, may be trained, based on the judgment of the discriminator 140, to output a fake image that approximates the real image so that the discriminator 140 judges the fake image to be real. For example, the discriminator 140 may compute an authenticity indicator (for example, a probability of being real, or a score of any form) for each of the real image and the fake image, and the generator 130 may be trained to output fake images for which the discriminator 140 computes an authenticity indicator close to that of the real image (or to a target authenticity indicator).
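As a hedged sketch of the competitive learning described above, the authenticity indicator may be taken to be a probability of being real, and both networks may be penalized with binary cross-entropy. This particular loss form is an assumption; the present application does not prescribe one, and the function names are hypothetical.

```python
import math

def bce(prob, target, eps=1e-7):
    """Binary cross-entropy for one probability and a 0/1 target."""
    prob = min(max(prob, eps), 1.0 - eps)
    return -(target * math.log(prob) + (1.0 - target) * math.log(1.0 - prob))

def discriminator_loss(p_real, p_fake):
    """Low when the real image is judged real and the fake image is judged fake."""
    return bce(p_real, 1.0) + bce(p_fake, 0.0)

def generator_loss(p_fake):
    """Low when the discriminator judges the fake image to be real."""
    return bce(p_fake, 1.0)

# p_real, p_fake: authenticity indicators computed by the discriminator.
confident_d = discriminator_loss(p_real=0.9, p_fake=0.1)
confused_d = discriminator_loss(p_real=0.5, p_fake=0.5)
fooled = generator_loss(p_fake=0.9)
caught = generator_loss(p_fake=0.1)
```

The two losses pull in opposite directions on the indicator computed for the fake image, which is what makes the training competitive.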

As another example, the product image generating apparatus 1000 may extract first edge information and/or first feature information from the fake image, extract second edge information and/or second feature information from the real image, and train the product image generation model 100 based on the difference between the first edge information and the second edge information or the difference between the first feature information and the second feature information.

Specifically, as described above, the product image generation model 100 may further include an extractor 150 comprising an edge extractor 152, which extracts edge information from the fake image and/or the real image, and a feature extractor 154, which extracts feature information from the fake image and/or the real image. The extractor 150 may extract the first edge information from the fake image through the edge extractor 152, or may extract the second edge information from the real image through the edge extractor 152. Likewise, the extractor 150 may extract the first feature information from the fake image through the feature extractor 154, or may extract the second feature information from the real image through the feature extractor 154.

In this case, the generator 130 of the product image generation model 100 may be trained based on the first edge information related to the fake image and the second edge information related to the real image. Specifically, the generator 130 may be trained, based on the difference between the first edge information and the second edge information, to output fake images that reduce that difference.
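A minimal sketch of an edge-based loss of this kind is given below. The present application does not state how the edge extractor 152 is implemented; the finite-difference gradient used here, and the names `edge_map` and `edge_loss`, are stand-in assumptions chosen only to make the example self-contained.

```python
def edge_map(image):
    """Approximate edge strength of a 2D grayscale image as |dx| + |dy|,
    using forward differences (assumed stand-in for the edge extractor)."""
    h, w = len(image), len(image[0])
    edges = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx = image[y][min(x + 1, w - 1)] - image[y][x]
            dy = image[min(y + 1, h - 1)][x] - image[y][x]
            edges[y][x] = abs(dx) + abs(dy)
    return edges

def edge_loss(fake, real):
    """Mean absolute difference between the two edge maps."""
    e_fake, e_real = edge_map(fake), edge_map(real)
    total = sum(abs(a - b)
                for row_f, row_r in zip(e_fake, e_real)
                for a, b in zip(row_f, row_r))
    return total / (len(fake) * len(fake[0]))

real_image = [[0.0, 0.0, 1.0] for _ in range(3)]  # contains a vertical edge
fake_image = [[0.5, 0.5, 0.5] for _ in range(3)]  # flat, no edges

loss = edge_loss(fake_image, real_image)
```

Here the flat fake image is penalized because it lacks the vertical edge present in the real image, even though its average brightness is plausible.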

As another example, the generator 130 of the product image generation model 100 may be trained based on the first feature information related to the fake image and the second feature information related to the real image. Specifically, the generator 130 may be trained, based on the difference between the first feature information and the second feature information, to output fake images that reduce that difference.

Meanwhile, the examples described above of training the generator 130 based on the fake image and the real image may be combined as appropriate. For example, the product image generating apparatus 1000 may train the generator 130 by applying to it both a loss based on the difference between the fake image and the real image and a loss based on the difference between the first feature information (or first edge information) extracted from the fake image through the extractor 150 and the second feature information (or second edge information) extracted from the real image.
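One possible way to combine such losses is a weighted sum, sketched below. The weights and the function name `total_generator_loss` are hypothetical; the present application states only that the losses may be combined and does not specify how.

```python
def total_generator_loss(pixel_loss, feature_loss, edge_loss=0.0,
                         w_pixel=1.0, w_feature=0.5, w_edge=0.5):
    """Weighted sum of the individual generator losses (weights are assumed)."""
    return w_pixel * pixel_loss + w_feature * feature_loss + w_edge * edge_loss

# Toy loss values standing in for the pixel, feature, and edge comparisons.
combined = total_generator_loss(pixel_loss=0.2, feature_loss=0.4, edge_loss=0.1)
```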

Meanwhile, as described above, the product image generation model 100 may include an image embedding network 160. The image embedding network 160 may obtain the real image and generate, from the real image, a second target feature corresponding to the first target feature output by the mapping network 120. Alternatively, the image embedding network 160 may obtain the fake image and generate, from the fake image, a second target feature corresponding to the first target feature output by the mapping network 120. Alternatively, the image embedding network 160 may obtain both the real image and the fake image and generate, from the real image and the fake image, a second target feature corresponding to the first target feature output by the mapping network 120.

In this case, the product image generation model 100 (or the product image generating apparatus 1000) may train the generator 130 (or the image embedding network 160), based on the difference between the first target feature and the second target feature, so that the two features approximate each other. For example, the product image generation model 100 may compare the first target feature with the second target feature and, according to the comparison result, apply a loss to the generator 130 (or the image embedding network 160) so as to reduce the difference between the first target feature and the second target feature. For example, the generator 130 may be trained by applying to it a loss according to the difference between the first target feature and the second target feature extracted from the fake image. Alternatively, the generator 130 may be trained by applying to it a loss according to the difference between the first target feature and the second target feature extracted from the real image.

In addition, the product image generation model 100 may train the generator 130 so as to reduce the difference between the real image and a fake image generated based on the second target feature (or the first target feature). As an example, the generator 130 may be trained by applying to it a loss (for example, an L1 or L2 loss) according to the difference between the real image and a fake image generated from the second target feature that the image embedding network 160 extracted from the real image. As another example, the generator 130 may be trained by applying to it a loss (for example, an L1 or L2 loss) according to the difference between the real image and a fake image generated from the second target feature that the image embedding network 160 extracted from the fake image.

Hereinafter, a method of obtaining (or outputting) a coordinating image using the trained product image generation model 100 according to an embodiment of the present application is described in more detail with reference to FIG. 6. FIG. 6 is a flowchart illustrating a method of outputting a coordinating image using the product image generation model 100 according to an embodiment of the present application.

The method of outputting a coordinating image according to an embodiment of the present application may include: obtaining a trained product image generation model 100 (S2100); obtaining a plurality of images including a first target image of a first target product related to a first product category and a second target image of a second target product related to a second product category (S2200); obtaining, using the product image generation model 100, a coordinating image related to a coordinating product to be coordinated with the first target product and the second target product (S2300); and outputting the coordinating image (S2400).

In the step of obtaining the trained product image generation model 100 (S2100), the product image generating apparatus 1000 may obtain the trained product image generation model 100 and/or execution data for executing the product image generation model 100. The trained product image generation model 100 may include the encoder unit 110, the mapping network 120, the generator 130, the discriminator 140, and/or the extractor 150 described with reference to FIGS. 2 and 3. Alternatively, the trained product image generation model 100 may include the encoder unit 110, the mapping network 120, the generator 130, the discriminator 140, the extractor 150, the image embedding network 160, and/or the refinement network 170 described with reference to FIG. 4.

In the step of obtaining a plurality of images including a first target image of a first target product related to a first product category and a second target image of a second target product related to a second product category (S2200), the product image generating apparatus 1000 may obtain a plurality of target images including the first target image of the first target product related to the first product category (for example, tops) and the second target image of the second target product related to the second product category (for example, bottoms or shoes).

In the step of obtaining, using the product image generation model, a coordinating image related to a coordinating product to be coordinated with the first target product and the second target product (S2300), the product image generating apparatus 1000 may obtain the coordinating image using the product image generation model 100. Here, the coordinating product may be a product related to a third product category (for example, outerwear) that differs from the first product category (for example, tops) and the second product category (for example, bottoms or shoes). Specifically, the product image generating apparatus 1000 may input the plurality of target images, including the first target image and the second target image, into the product image generation model 100 and obtain the coordinating image output by the product image generation model 100.

In the step of outputting the coordinating image (S2400), the product image generating apparatus 1000 may output the coordinating image through any output unit (for example, a display) or transmit the coordinating image to any external device (for example, a user terminal).

Meanwhile, as described above, the product image generation model 100 may further include a refinement network 170. In this case, the coordinating image in the step of outputting the coordinating image (S2400) may be understood to encompass a coordinating image from which noise has been removed, or whose quality has been improved, through the refinement network 170.
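The data flow of steps S2200 to S2400 might be sketched as follows. Every callable below is a toy stand-in introduced only for illustration (real encoders, mapping network, generator, and refinement network would be trained neural networks), and the function name `generate_coordinating_image` is hypothetical.

```python
def generate_coordinating_image(target_images, encoders, merge,
                                mapping_network, generator, refine=None):
    """Encode each target image, merge the features, map the merged feature
    to a target feature, generate an image, and optionally refine it."""
    features = [encode(image) for encode, image in zip(encoders, target_images)]
    merged_feature = merge(features)
    target_feature = mapping_network(merged_feature)
    image = generator(target_feature)
    return refine(image) if refine is not None else image

# Toy stand-ins (assumptions) for the trained components.
encoders = [lambda img: [sum(img) / len(img)],   # "first encoder"
            lambda img: [max(img)]]              # "second encoder"
merge = lambda feats: [v for f in feats for v in f]
mapping_network = lambda z: [2.0 * v for v in z]
generator = lambda w: [v + 1.0 for v in w]
refine = lambda img: [min(v, 8.0) for v in img]  # e.g. clamp out-of-range values

first_target_image = [1.0, 3.0]    # e.g. a top
second_target_image = [0.5, 4.0]   # e.g. bottoms or shoes
coordinating_image = generate_coordinating_image(
    [first_target_image, second_target_image],
    encoders, merge, mapping_network, generator, refine)
```

Note that the discriminator, the extractor, and the image embedding network do not appear in this flow; in the description above they are used only to compute training losses.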

According to the product image generation model, the product image generating method using the same, and the product image generating apparatus of the embodiments of the present application, an image related to a coordinating product that goes well with a plurality of products can be generated automatically from a plurality of images related to those products.

In addition, according to the product image generation model, the product image generating method using the same, and the product image generating apparatus of the embodiments of the present application, the product image generation model can be trained in various ways, including competitive learning using a generative adversarial network, applying a loss according to a comparison between pieces of information extracted through the extractor, and/or applying a loss based on features generated through the image embedding network, so that the accuracy of the product image generation model can be increased.

The various operations of the product image generating apparatus 1000 described above may be stored in the memory 1200 of the product image generating apparatus 1000, and the processor 1300 of the product image generating apparatus 1000 may be provided to perform the operations stored in the memory 1200.

The features, structures, effects, and the like described in the embodiments above are included in at least one embodiment of the present invention and are not necessarily limited to a single embodiment. Furthermore, the features, structures, effects, and the like illustrated in each embodiment may be combined or modified for other embodiments by a person of ordinary skill in the art to which the embodiments belong. Accordingly, matters related to such combinations and modifications should be construed as falling within the scope of the present invention.

In addition, although the description above has focused on the embodiments, they are merely examples and do not limit the present invention, and a person of ordinary skill in the art to which the present invention pertains will appreciate that various modifications and applications not illustrated above are possible without departing from the essential characteristics of the embodiments. That is, each component specifically shown in the embodiments may be modified in practice. Differences related to such modifications and applications should be construed as falling within the scope of the present invention as defined in the appended claims.

Claims

1. A method by which a product image generating apparatus trains a product image generation model that generates, based on a first image of a first product related to a first product category and a second image of a second product related to a second product category, a coordinating image related to a coordinating product to be coordinated with the first product and the second product,
wherein the product image generation model comprises: an encoder unit including a first encoder that extracts a first feature from the first image and a second encoder that extracts a second feature from the second image; a mapping network that transforms a merged feature generated based on the first feature and the second feature to generate a first target feature for generating the coordinating image; and a generator that generates a fake image related to the coordinating product belonging to a third product category,
the method comprising:
obtaining a plurality of images including the first image related to the first product and the second image related to the second product;
extracting the first feature from the first image through the first encoder and extracting the second feature from the second image through the second encoder;
generating the merged feature based on the first feature and the second feature;
transforming the merged feature through the mapping network to obtain the first target feature;
generating the fake image related to the coordinating product from the first target feature through the generator; and
training the product image generation model based on a difference between the generated fake image and a real image related to the coordinating product.

2. The method of claim 1,
wherein the product image generation model further comprises a discriminator, and
wherein training the product image generation model further comprises:
training the generator to generate the fake image so that it approximates the real image; and
training the discriminator to determine that the fake image is fake and that the real image is real.

3. The method of claim 1,
wherein the product image generation model further comprises an extractor including an edge extractor that extracts edge information from at least one of the fake image and the real image, and a feature extractor that extracts feature information from at least one of the fake image and the real image.

4. The method of claim 3,
wherein training the product image generation model further comprises:
obtaining, through the edge extractor, first edge information from the fake image and second edge information from the real image; and
training the generator based on a difference between the first edge information and the second edge information.

5. The method of claim 3,
wherein training the product image generation model further comprises:
obtaining, through the feature extractor, first feature information from the fake image and second feature information from the real image; and
training the generator based on a difference between the first feature information and the second feature information.

6. The method of claim 1,
wherein the product image generation model further comprises an image embedding network that extracts a second target feature based on at least one of the fake image and the real image, and
wherein training the product image generation model further comprises:
calculating a first loss value based on the first target feature and the second target feature; and
training the generator, based on the first loss value, to generate the fake image so that it approximates the real image.

7. The method of claim 6,
wherein training the product image generation model further comprises:
calculating a second loss value based on a difference between the real image and a fake image generated based on the second target feature; and
training the generator, based on the second loss value, to generate the fake image so that it approximates the real image.

8. The method of claim 1,
wherein the product image generation model further comprises an image embedding network that extracts a third target feature based on the fake image and a fourth target feature based on the real image, and
wherein training the product image generation model further comprises:
calculating a third loss value based on the third target feature and the fourth target feature; and
training the generator based on the third loss value.

9. The method of claim 1,
wherein the product image generation model further comprises a refinement network that removes noise from the fake image obtained through the generator or improves the quality of the fake image.

10. The method of claim 1,
wherein the merged feature is generated by performing normalization on each of the first feature and the second feature and then merging them.

11. A computer-readable recording medium on which a program for causing a computer to execute the method according to any one of claims 1 to 10 is recorded.