JP7224323B2

JP7224323B2 - Image generation system and image generation method using the same

Info

Publication number: JP7224323B2
Application number: JP2020169539A
Authority: JP
Inventors: ユンジェチェー; ヨンジョンウ; ジョンウハ
Original assignee: Naver Corp
Current assignee: Naver Corp
Priority date: 2020-05-29
Filing date: 2020-10-07
Publication date: 2023-02-17
Anticipated expiration: 2040-10-07
Also published as: KR102427484B1; KR20210147507A; JP2021190062A

Description

Application of Article 30, Paragraph 2 of the Patent Act: published on December 4, 2019 as "Diverse Image Synthesis for Multiple Domains" on the website https://arxiv.org/abs/1912.01865.

The present invention relates to an image generation system and an image generation method using the same.

Image generation technology, which generates a new image by converting some features of an image into other features or by synthesizing multiple images with one another, is not only used for various purposes in industry but has recently also become widely used by general users as a form of entertainment.

With the development of artificial intelligence, image generation technology is advancing day by day and has reached a level at which generated images are difficult for the human eye to distinguish from real ones.

In particular, image generation technology has advanced rapidly on the basis of the generative adversarial network (GAN), devised in 2014 by the research team of Professor Yoshua Bengio.

A generative adversarial network (GAN) consists of a generative model that learns a probability distribution and a discriminative model that distinguishes between different sets. The image generation model (or generator) is trained to create fake images in the target domain that deceive the discriminative model as much as possible. The discriminative model (or discriminator) is trained to distinguish, as accurately as possible with respect to the target domain, the fake images presented by the generative model from real images.

Training the generative model to deceive the discriminative model in this way is called an adversarial process. By developing the generative model and the discriminative model against each other through this adversarial process, a generative adversarial network has become able to generate, for the target domain, a similar image, that is, a fake image, that closely resembles a real image.
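The adversarial process described above can be illustrated with the two opposing objectives. This is a minimal sketch, not the loss used by the patent: it assumes the discriminator outputs a probability in (0, 1) that an image is real, and it uses the common non-saturating generator loss. The function names are hypothetical.

```python
import math

def discriminator_loss(d_real: float, d_fake: float) -> float:
    """Binary cross-entropy for the discriminator.

    d_real: discriminator output for a real image (should approach 1).
    d_fake: discriminator output for a generated image (should approach 0).
    """
    return -(math.log(d_real) + math.log(1.0 - d_fake))

def generator_loss(d_fake: float) -> float:
    """Non-saturating generator loss: small when the fake image is judged real."""
    return -math.log(d_fake)

# A discriminator that separates real from fake well has a lower loss ...
good_d = discriminator_loss(d_real=0.9, d_fake=0.1)
poor_d = discriminator_loss(d_real=0.6, d_fake=0.4)

# ... while the generator's loss drops as it fools the discriminator.
fooled = generator_loss(d_fake=0.9)
caught = generator_loss(d_fake=0.1)
```

The two losses pull in opposite directions on `d_fake`, which is what makes the training adversarial.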

However, because the image generation model and the discriminative model in such a generative adversarial network are trained with respect to a target domain, they have the limitation that a new image generation model and a new discriminative model must be trained whenever the target domain changes.

Accordingly, there is still a need for an image generation method that can flexibly handle various target domains.

The present invention provides an image generation system capable of generating various images corresponding to different target domains, and an image generation method using the same.

To solve the above problem, an image generation system according to the present invention may include an image input unit that receives a source image to be transformed, a style code input unit that inputs a style code related to the appearance style of a reference image, and an image generator that uses the style code to generate a composite image in which the appearance style of the reference image is reflected in the source image.

Using a style code that includes domain characteristics, the image generation system according to the present invention can generate an image having the domain that corresponds to the domain characteristics included in the style code.

FIGS. 1 and 2 are conceptual diagrams for explaining an image generation system and an image generation method using the same according to the present invention.
FIG. 3 is a flowchart for explaining an image generation method according to the present invention.
Three further figures are conceptual diagrams for explaining a method of generating a style code using a mapping network according to the present invention.
Two further figures are conceptual diagrams for explaining a method of generating a style code using a style encoder according to the present invention.
A final figure is a conceptual diagram for explaining a method of training the image generation system according to the present invention.

Hereinafter, the embodiments disclosed in this specification are described in detail with reference to the accompanying drawings. Identical or similar components are given the same reference numerals regardless of the figure in which they appear, and duplicate descriptions of them are omitted. The suffixes "module" and "unit" used for components in the following description are assigned or used interchangeably only for ease of drafting the specification, and do not in themselves have distinct meanings or roles. In describing the embodiments disclosed in this specification, a detailed description of related known technology is omitted where it is judged that it could obscure the gist of the embodiments. The accompanying drawings are intended only to facilitate understanding of the embodiments disclosed in this specification; the technical idea disclosed in this specification is not limited by the accompanying drawings and should be understood to include all modifications, equivalents, and alternatives falling within the spirit and technical scope of the present invention.

Terms that include ordinal numbers, such as first and second, may be used to describe various components, but the components are not limited by these terms. The terms are used only to distinguish one component from another.

When a component is said to be "coupled" or "connected" to another component, it may be directly coupled or connected to that other component, but it should be understood that other components may exist in between. In contrast, when a component is said to be "directly coupled" or "directly connected" to another component, it should be understood that no other component exists in between.

Singular expressions include plural expressions unless the context clearly indicates otherwise.

In this application, terms such as "include" or "have" are intended to specify the presence of the features, numbers, steps, operations, components, parts, or combinations thereof described in the specification, and should be understood not to exclude in advance the presence or possible addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

The present invention provides an image generation system capable of generating various images corresponding to different target domains, and an image generation method using the same.

More specifically, the present invention provides an image generation system that can use a single image generator to generate different images, each corresponding to a different target domain, and an image generation method using the same.

Further, the present invention provides an image generation system capable of generating images with various appearance styles with respect to a target domain, and an image generation method using the same. In particular, the image generation system according to the present invention can generate an image based on "image-to-image translation".

Here, "image-to-image translation" means generating a new image based on a given input image. More specifically, image-to-image translation can mean generating a new image by transforming at least part of the input image.

In particular, the present invention relates to an image generation system that, in performing image-to-image translation, can generate new images for various styles and domains with only a single image generator.

The image generator can generate images of various styles for the same domain, or images of the same style for different domains. The image generation system according to the present invention is described in more detail below with reference to the accompanying drawings. FIGS. 1 and 2 are conceptual diagrams for explaining an image generation system and an image generation method using the same according to the present invention, and FIG. 3 is a flowchart for explaining the image generation method according to the present invention.

As shown in FIG. 1, the image generation system 100 according to the present invention can be configured to include a generator (or image generator) 110 and a style code input unit 120 (hereinafter, for convenience of explanation, the "generator 110" is referred to as the "image generator 110"). The image generation system 100 may further include at least one of an input unit 130 and an output unit 140.

The image generator 110 generates an image based on an image input through the input unit 130, and the generated image can be output through the output unit 140.

In the present invention, for convenience of explanation, an image input to the image generator 110 for image generation is referred to as a "source image".

Here, the source image can mean the image on which image transformation (or image generation) is based. The image generator 110 can generate a new image based on the source image. As shown in FIG. 1, the source image 100a can be input to the image generator 110 through the input unit 130.

Further, in the present invention, for convenience of explanation, the image generated by the image generator 110 is referred to as a "composite image (or output image)". As shown in FIG. 1, the composite image 200 can be output through the output unit 140.

In this way, the image generator 110 can generate the composite image 200 from the source image 100a input through the input unit 130, using the reference image 100b.

Here, the image generator 110 can generate the composite image 200 using the style code input through the style code input unit 120.
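The data flow among the units described above can be sketched as follows. This is only an illustration under stated assumptions: the class name, the type aliases, and `toy_generator` are hypothetical, an image is simplified to a flat list of pixel values, and the placeholder transform stands in for a trained generator network.

```python
from dataclasses import dataclass
from typing import Callable, List

Image = List[float]        # hypothetical: an image flattened to a list of pixel values
StyleCode = List[float]    # the style code is described as a vector

@dataclass
class ImageGenerationSystem:
    # Stand-in for image generator 110: (source image, style code) -> composite image.
    generator: Callable[[Image, StyleCode], Image]

    def generate(self, source: Image, style_code: StyleCode) -> Image:
        # Input unit 130 supplies `source`; style code input unit 120 supplies
        # `style_code`; the return value is what output unit 140 would emit.
        return self.generator(source, style_code)

def toy_generator(source: Image, style_code: StyleCode) -> Image:
    # Placeholder transform: shift every pixel by the first style component.
    return [p + style_code[0] for p in source]

system = ImageGenerationSystem(generator=toy_generator)
composite = system.generate([0.0, 0.5, 1.0], [0.25])
```

Because the style code is an argument rather than part of the generator, the same generator instance can be reused with different style codes.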

As shown in FIG. 1, the source image 100a can include at least one graphic object (for example, an image of a person). The image generator 110 can generate the composite image 200 by reflecting the appearance style given by the style code in this graphic object (or first graphic object).

In the present invention, a graphic object can be understood as an image of a thing, such as a person, an animal, a car, a flower, a bag, or a mountain.

In this specification, for convenience of explanation, the graphic object included in the source image 100a is referred to as the "first graphic object", the graphic object included in the composite image 200 as the "third graphic object", and the graphic object included in the reference image 100b as the "second graphic object". The second graphic object can mean not only an object included in the reference image 100b but also an object specified by noise information sampled from a Gaussian distribution. Such an object sampled from a Gaussian distribution can also be described as the target from which the style code is extracted (or the target referenced in order to extract the style code).

That is, the second graphic object may be included in the reference image 100b, or may correspond to a specific noise sample of a Gaussian distribution that follows the data distribution over multiple reference images.

In the following, for convenience of explanation, the second graphic object corresponding to a specific noise sample of a Gaussian distribution is not named separately, and all such objects are described uniformly as the "reference image".

That is, in the following, the second graphic object and the reference image are treated as having the same meaning. Accordingly, the reference image below can also mean an object specified by a Gaussian distribution.
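The two sources of a "reference image" described above (an actual reference image, or a noise sample from a Gaussian distribution) can be sketched as two routes to a style code. This is a toy sketch: `style_from_noise` and `style_from_reference` are hypothetical stand-ins for the learned mapping network and style encoder named in the figure descriptions, and the per-domain offset and the use of image statistics are assumptions made purely for illustration.

```python
import random
from statistics import mean, pstdev
from typing import List, Sequence

STYLE_DIM = 2  # hypothetical style code length

def style_from_noise(rng: random.Random, domain: int) -> List[float]:
    """Toy stand-in for a mapping network: Gaussian noise -> style code."""
    z = [rng.gauss(0.0, 1.0) for _ in range(STYLE_DIM)]
    # Assumption: a per-domain offset stands in for learned, domain-specific weights.
    return [v + float(domain) for v in z]

def style_from_reference(pixels: Sequence[float], domain: int) -> List[float]:
    """Toy stand-in for a style encoder: reference image -> style code."""
    # Assumption: simple image statistics stand in for learned features.
    return [mean(pixels) + float(domain), pstdev(pixels)]

rng = random.Random(0)
code_a = style_from_noise(rng, domain=1)                 # sampled "reference"
code_b = style_from_reference([0.0, 1.0], domain=0)      # actual reference image
```

Either route yields a vector of the same length, so the generator does not need to know which one produced the style code.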

In this specification, the source image and the first graphic object may also be used with the same meaning. That is, the appearance style of the source image can mean the appearance style of the first graphic object.

Here, the style code can be related to the appearance style of the reference image 100b. The "appearance style" is an element that can define the visual appearance of the reference image 100b, and can be determined by various factors such as hairstyle and gender.

As described above, the reference image 100b can mean the target that is referenced in order to change the appearance style of the source image 100a.

In this way, the image generator 110 can generate a composite image 200 that reflects the appearance style of the reference image by reflecting, in the source image 100a, the style code corresponding to the appearance style of the reference image.

In the present invention, generating the composite image 200 can mean transforming (or changing) the appearance style of the source image 100a, that is, of the first graphic object, with reference to the appearance style of the reference image 100b. As a result, the present invention can generate a composite image in which part of the first graphic object has been transformed into the appearance style of the reference image.

In the present invention, the style code can include style information and domain characteristic information. The style information can be information about a style associated with the domain given by the domain characteristic information.

The image generator 110 can generate the composite image 200 by transforming the appearance style of the source image 100a (more specifically, of the first graphic object included in the source image 100a) based on the style information and domain characteristic information included in the style code. In doing so, the image generator 110 can generate the composite image 200 from the source image 100a so that the composite image 200 has the domain corresponding to the domain characteristic information included in the style code.

As a result, the third graphic object included in the composite image 200 can be a graphic object in which the style information and domain characteristic information included in the style code are reflected in the first graphic object. That is, the third graphic object can be an image in which the appearance style of the second graphic object is combined with the first graphic object.

In this way, the present invention can generate the composite image 200 based on the source image 100a using a style code that includes style information and domain characteristic information.

That is, the image generation system 100 according to the present invention can generate the composite image 200 by changing a specific domain of the source image 100a to the specific domain of the reference image 100b.

As shown in FIG. 2, the style code can include information about the style and domain of each of the reference images 101b, 102b, 103b, 104b, 105b, 106b.

Here, the style code can take the form of a vector, as shown in FIG. 2. The style code input unit 120 can input the style code in this vector form to the image generator 110 through adaptive instance normalization (AdaIN).
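Adaptive instance normalization can be sketched for a single feature channel as follows. This is a minimal illustration, not the patent's implementation: it assumes the style code has already been mapped to one scale and one bias per channel (in practice this mapping is typically a learned affine layer), and the function name `adain` and the `eps` parameter are choices made here.

```python
from statistics import mean, pstdev
from typing import List, Sequence

def adain(channel: Sequence[float], scale: float, bias: float,
          eps: float = 1e-5) -> List[float]:
    """Normalize one feature channel, then apply style-derived scale and bias."""
    mu = mean(channel)
    sigma = pstdev(channel)
    # Instance normalization removes the channel's own mean and spread;
    # the style-derived parameters then set the new mean and spread.
    return [scale * (x - mu) / (sigma + eps) + bias for x in channel]

# Hypothetical values: a style code mapped to scale=2.0 and bias=5.0.
styled = adain([1.0, 2.0, 3.0], scale=2.0, bias=5.0)
```

After the call, the channel's mean is approximately `bias` and its standard deviation approximately `scale`, which is how a vector-valued style code can steer the statistics of the generator's features.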

As described above, the style code can include style information and domain characteristic information for specifying the style and domain of the reference image 100b. To aid understanding of the present invention, the meanings of style information, domain, and domain characteristic information are explained below.

First, "style information" means information about the appearance style of a graphic object, that is, its visual features (or visual appearance).

Here, a visual feature can mean a feature related to visible appearance, such as a hairstyle.

Such style information can include feature information for at least one of multiple categories (which may also be called style categories or attributes).

Here, a category or attribute can be understood as a criterion for distinguishing the meaningful visual features of a graphic object. A category can also be understood as an element for defining the appearance style of a graphic object.

The feature information for a category can mean a data representation of what visual feature the graphic object has in that category.

The "feature information for a category" can also be called an "attribute value".

To describe "categories (or attributes)" more specifically, the kinds of categories (or attributes) used to express the appearance style, that is, the visual features, of a graphic object can be very diverse.

For example, gender, age, hairstyle, hair color, skin color, makeup, beard, face shape, facial expression, glasses, accessories, eyebrow shape, eye shape, lip shape, nose shape, ear shape, philtrum shape, and so on can each be understood as an individual category (or attribute).

The style information can include both identification information for a category (category type, category index information, and the like) and feature information for that category.

For example, the identification information for a category may be "hairstyle" and the feature information for that category may be "blonde waves".

In this way, the style code can include style information containing information about at least one of the various categories that can define the appearance style of a graphic object (including at least one of the identification information for the category and the feature information for the category).

For example, among the composite images 200 shown in FIG. 1, consider the first composite image 201 and the second composite image 202 from the viewpoint of the "hairstyle" category. In this case, for the hairstyle category, the first composite image 201 can have feature information, that is, style information, corresponding to "black wavy hair 201a" from the first reference image 101b. For the same category, the second composite image 202 can have feature information, that is, style information, corresponding to "blonde wavy hair with bangs 202a" from the second reference image 102b.

In this way, the first and second composite images 201, 202 can have different style information for the same category (for example, the "hairstyle" category).

Accordingly, the appearance style of the composite image can change depending on which category, and which feature of that category, the style information in the style code contains.

The image generator 110 according to the present invention can therefore reflect, in the source image 100a, a style code that includes style information extracted from the appearance style of the reference image 100b. The image generator 110 can thereby generate a composite image 200 having the appearance style of the reference image 100b.

In this way, the image generator 110 can transform at least one category of the source image 100a based on the style information included in the style code.

Among the multiple categories that define the appearance style of the source image 100a (or first graphic object), the image generator 110 can perform the transformation with respect to the category that is the same as, or corresponds to, the category included in the style information.

Here, transforming a specific category of the source image 100a means transforming the feature information, or attribute value, of the source image 100a for that category. When this feature information changes, the visual appearance for that category changes.
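The category-wise transformation described above can be modeled symbolically. This is only a conceptual sketch: the actual system operates on pixels through a vector-valued style code, whereas here the appearance style is simplified to a dictionary from category to attribute value. `StyleInfo` and `apply_style` are hypothetical names, and the source attribute values are made up for illustration.

```python
from dataclasses import dataclass
from typing import Dict

@dataclass(frozen=True)
class StyleInfo:
    category: str   # identification information for the category, e.g. "hairstyle"
    value: str      # feature information (attribute value), e.g. "blonde waves"

def apply_style(source_attrs: Dict[str, str], style: StyleInfo) -> Dict[str, str]:
    """Return a copy of the source attributes with one category transformed."""
    result = dict(source_attrs)           # the source image itself is left unchanged
    result[style.category] = style.value  # only the matching category is replaced
    return result

source = {"hairstyle": "short black hair", "expression": "smiling"}
composite = apply_style(source, StyleInfo("hairstyle", "blonde waves"))
```

Only the category named in the style information changes; every other category of the source carries over to the composite.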

Next, the domain and the domain characteristic information are described.

A domain can mean the feature information (or attribute value) for at least one category that serves as a criterion, among the multiple different categories described above that distinguish the appearance style of an image (or graphic object).

Here, "criterion" can be understood in various senses, such as a criterion for image transformation, image classification, or image partitioning.

When multiple different images are described as "having the same attribute value for a particular category" or "having mutually different attribute values for a particular category", the "attribute value for the particular category" is what is meant by a domain.

For example, when the domain is described with respect to the "gender" category among the multiple categories, the first, second, and third images 201, 202, 203 have the same domain, as shown in FIG. 2. The fourth, fifth, and sixth images 204, 205, 206 also share a domain. However, the domain of the first, second, and third images 201, 202, 203 can differ from the domain of the fourth, fifth, and sixth images 204, 205, 206. That is, the domain of the first, second, and third images 201, 202, 203 is "female", and the domain of the fourth, fifth, and sixth images 204, 205, 206 is "male". Here, "female" or "male" is the domain.

Thus, a domain is at least one of the attribute values for the various categories related to appearance style, and can be an index that serves as the criterion for image transformation, image classification, or image partitioning.
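The idea that a domain is the attribute value of a chosen criterion category can be made concrete with the FIG. 2 example. This is a small sketch under the same symbolic simplification as the text: the dictionary representation and the function name `group_by_domain` are assumptions, while the image numbers and the "female"/"male" values follow the description above.

```python
from collections import defaultdict
from typing import Dict, List

def group_by_domain(images: Dict[int, Dict[str, str]],
                    category: str) -> Dict[str, List[int]]:
    """Partition images by their attribute value for the criterion category."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for image_id, attrs in images.items():
        groups[attrs[category]].append(image_id)
    return dict(groups)

# Composite images 201-206, with "gender" as the criterion category.
images = {
    201: {"gender": "female"}, 202: {"gender": "female"}, 203: {"gender": "female"},
    204: {"gender": "male"}, 205: {"gender": "male"}, 206: {"gender": "male"},
}
domains = group_by_domain(images, "gender")
```

Choosing a different criterion category (for example hairstyle) would partition the same images into different domains.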

一方、スタイルコードに含まれたドメイン特性情報は、特定ドメイン（または、ターゲットドメイン）を表すデータであって、外貌スタイルを区分する特定カテゴリー（または、属性）及びこれに対する特徴情報（属性値）を含むことができる。 On the other hand, the domain characteristic information included in the style code is data representing a specific domain (or target domain), and includes a specific category (or attribute) that distinguishes the appearance style and characteristic information (attribute value) therefor. can contain.

一方、イメージ生成部１１０は、スタイルコードに含まれたドメイン特性情報に基づいて合成イメージ２００のドメインを決定できる。 Meanwhile, the image generator 110 can determine the domain of the composite image 200 based on the domain property information included in the style code.

前記イメージ生成部１１０は、合成イメージ２００がスタイルコードに含まれたドメイン特性情報によるドメインを有するようにソースイメージ１００ａを変換できる。 The image generator 110 can transform the source image 100a so that the composite image 200 has a domain according to the domain characteristic information included in the style code.

ここで、スタイルコードに含まれたドメイン特性情報は、基準イメージの特定ドメインに関する情報でありうる。すなわち、イメージ生成部１１０は、合成イメージ２００が、基準イメージの特定ドメインと同じドメインを有するようにソースイメージ１００ａを変換できる。 Here, the domain characteristic information included in the style code may be information about a specific domain of the reference image. That is, the image generator 110 can transform the source image 100a so that the synthetic image 200 has the same domain as the specific domain of the reference image.

例えば、スタイルコードに第４、第５、及び第６の基準イメージ１０４ｂ、１０５ｂ、１０６ｂによる「男性」に該当する特定ドメインに対するドメイン特性情報が含まれた場合、イメージ生成部１１０により生成された第４、第５、及び第６のイメージ２０４、２０５、２０６は、「男性」ドメインを有することができる。 For example, when the style code includes domain characteristic information for the specific domain corresponding to "male" from the fourth, fifth, and sixth reference images 104b, 105b, 106b, the fourth, fifth, and sixth images 204, 205, 206 generated by the image generator 110 can have the "male" domain.

このように、イメージ生成部１１０は、合成イメージ２０４、２０５、２０６が基準イメージ（例えば、第４、第５、及び第６の基準イメージ１０４ｂ、１０５ｂ、１０６ｂ）の特定ドメイン（例えば、男性）を有するように、ソースイメージ１００ａに前記ドメイン特性情報を反映できる。 In this way, the image generator 110 can reflect the domain characteristic information in the source image 100a so that the composite images 204, 205, 206 have the specific domain (e.g., male) of the reference images (e.g., the fourth, fifth, and sixth reference images 104b, 105b, 106b).

このとき、イメージ生成部１１０は、ソースイメージ１００ａのドメインとスタイルコードに含まれたドメイン特性情報による特定ドメインとが異なる場合、これを考慮せずに合成イメージ２００のドメインを決定できる。 At this time, even if the domain of the source image 100a differs from the specific domain indicated by the domain characteristic information included in the style code, the image generator 110 can determine the domain of the composite image 200 without taking that difference into account.

すなわち、イメージ生成部１１０は、ソースイメージ１００ａの特定ドメインと基準イメージ１００ｂの特定ドメインとが異なる場合、ソースイメージ１００ａの特定ドメインより、前記基準イメージ１００ｂの特定ドメインを優先して、合成イメージ（または、第３のグラフィックオブジェクト）のドメインを決定できる。その結果、合成イメージ２００は、基準イメージ１００ｂの特定ドメインを有する。 That is, when the specific domain of the source image 100a and the specific domain of the reference image 100b differ, the image generator 110 can determine the domain of the composite image (or third graphic object) by giving priority to the specific domain of the reference image 100b over that of the source image 100a. As a result, the composite image 200 has the specific domain of the reference image 100b.

一方、イメージ生成部１１０は、スタイルコードに基づいてソースイメージ１００ａを変換する場合、ソースイメージ１００ａの外貌的正体性を決定する少なくとも１つの外貌特徴部分を基準に、残りの部分に対する外貌スタイルを変更できる。 Meanwhile, when transforming the source image 100a based on the style code, the image generator 110 can change the appearance style of the remaining portion of the source image 100a while using, as a reference, at least one appearance feature portion that determines the appearance identity of the source image 100a.

より具体的に、ソースイメージ１００ａは、前記ソースイメージ１００ａの外貌的正体性を決定する少なくとも１つの外貌特徴部分を含むことができる。イメージ生成部１１０は、ソースイメージ１００ａの外貌特徴部分を除いた残りの部分を中心に、前記ソースイメージ１００ａに対して基準イメージ１００ｂの外貌スタイルを反映できる。このとき、基準イメージ１００ｂの外貌スタイルは、スタイルコードに含まれたドメイン特性情報に対応する基準イメージの特定ドメインを基準に定義された外貌スタイルを意味できる。 More specifically, the source image 100a can include at least one appearance feature portion that determines the appearance identity of the source image 100a. The image generator 110 can reflect the appearance style of the reference image 100b in the source image 100a, mainly in the portion that remains after excluding the appearance feature portion. Here, the appearance style of the reference image 100b can mean an appearance style defined with reference to the specific domain of the reference image that corresponds to the domain characteristic information included in the style code.

ソースイメージ１００ａ及び基準イメージ１００ｂが人に対応する場合、前記ソースイメージ１００ａの前記外貌特徴部分は、人の目、鼻、及び口のうち、少なくとも１つに対応する部分でありうる。このとき、前記基準イメージ１００ｂの外貌スタイルは、人の頭髪スタイル、ひげ、年齢、皮膚色、メーキャップのうち、少なくとも１つと関連したものでありうる。 When the source image 100a and the reference image 100b correspond to a person, the feature portion of the source image 100a may be a portion corresponding to at least one of the person's eyes, nose, and mouth. At this time, the appearance style of the reference image 100b may be associated with at least one of a person's hair style, beard, age, skin color, and makeup.

一方、前記ソースイメージ１００ａの外貌的正体性を決定する要素は様々でありうるし、イメージ生成部１１０は、合成イメージ２００の合成目的によって、外貌的正体性を決定する要素を異なるように決定することができる。 Meanwhile, various factors can determine the appearance identity of the source image 100a, and the image generator 110 can choose those factors differently depending on the purpose for which the composite image 200 is synthesized.

イメージ生成部１１０において、どの部分を外貌的正体性と決定するか否かは、予め入力された情報に基づいて決定されることも可能である。 Which portion the image generator 110 treats as the appearance identity can also be determined based on information input in advance.

例えば、合成イメージ２００の目的が特定人物に対する様々な頭髪スタイルの変化を表すことであるならば、このとき、外貌的正体性を表す外貌特徴部分は、特定人物の目、鼻、口、顔型などに対応する部分でありうる。 For example, if the purpose of the composite image 200 is to show various hairstyle changes for a specific person, the appearance feature portion representing the appearance identity can be the portion corresponding to that person's eyes, nose, mouth, face shape, and so on.

その結果、図１に示されたように、イメージ生成部１１０は、ソースイメージ１００ａの外貌的正体性に該当する外貌特徴部分を除いた残りの部分を中心に、前記ソースイメージ１００ａに対して基準イメージ１００ｂの外貌スタイル（例えば、ヘアスタイル）を反映できる。その結果、ソースイメージ１００ａの外貌的正体性を維持しながら、基準イメージ１００ｂの外貌スタイルを有する合成イメージ２００が生成され得る。 As a result, as shown in FIG. 1, the image generator 110 can reflect the appearance style (e.g., hairstyle) of the reference image 100b in the source image 100a, mainly in the portion that remains after excluding the appearance feature portion corresponding to the appearance identity of the source image 100a. Consequently, a composite image 200 can be generated that has the appearance style of the reference image 100b while preserving the appearance identity of the source image 100a.

一方、ここで、外貌的正体性は、ソースイメージ１００ａに含まれたグラフィックオブジェクトのポーズ（ｐｏｓｅ）または姿勢を含むことができる。 Meanwhile, the appearance identity here can include the pose or posture of the graphic object included in the source image 100a.

すなわち、イメージ生成部１１０は、ソースイメージ１００ａに含まれたグラフィックオブジェクトのポーズと同じポーズを有するグラフィックオブジェクトが含まれるように合成イメージ２００を生成できる。 That is, the image generator 110 may generate the composite image 200 so as to include graphic objects having the same poses as those of the graphic objects included in the source image 100a.

このように、本発明に係るイメージ生成システム１００は、入力部１１０を介してソースイメージを受信し（Ｓ３１０）、スタイルコード入力部１２０を介して外貌スタイルと関連したスタイルコードを受信する（Ｓ３２０）。そして、受信されたスタイルコードを用いて、スタイルコードに対応する外貌スタイルが反映されたイメージを生成できる（Ｓ３３０）。 As described above, the image generation system 100 according to the present invention receives the source image through the input unit 110 (S310), and receives the style code associated with the appearance style through the style code input unit 120 (S320). . Using the received style code, an image reflecting the appearance style corresponding to the style code can be generated (S330).
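
The flow of steps S310 to S330 can be sketched as follows. This is only a minimal illustration under stated assumptions: the names `StyleCode`, `Image`, and `generate_image` are hypothetical and do not appear in the patent, and the "images" are plain records rather than pixel data. It shows only that the composite keeps the identity of the source while taking its domain and style from the style code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StyleCode:
    # Hypothetical container: domain characteristic information plus style information.
    domain: str
    style: List[float]


@dataclass
class Image:
    # Hypothetical stand-in for an image: identity covers eyes, nose, mouth, pose, etc.
    identity: str
    domain: str
    style: List[float]


def generate_image(source: Image, code: StyleCode) -> Image:
    """S330: keep the source identity; take domain and style from the style code."""
    return Image(identity=source.identity, domain=code.domain, style=list(code.style))


source = Image(identity="person-A", domain="female", style=[0.1, 0.2])  # S310
code = StyleCode(domain="male", style=[0.7, -0.3])                      # S320
composite = generate_image(source, code)                                # S330
```

In this sketch the domain carried by the style code always overrides the domain of the source, which mirrors the priority rule described for the image generator 110.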

以上で説明したように、本発明に係るイメージ生成システム１００は、イメージ生成部１１０にドメインの特性情報を含むスタイルコードに基づいて合成イメージを生成できる。 As described above, in the image generation system 100 according to the present invention, the image generator 110 can generate a composite image based on a style code that includes domain characteristic information.

以下では、スタイルコードを生成する方法について添付された図面とともにより具体的に説明する。図４、図５、及び図６は、本発明に係るマッピングネットワークを利用してスタイルコードを生成する方法を説明するための概念図である。 Hereinafter, a method for generating the style code will be described in more detail with attached drawings. 4, 5, and 6 are conceptual diagrams for explaining a method of generating style codes using a mapping network according to the present invention.

前述したように、本発明に係るイメージ生成部１１０は、スタイルコード入力部１２０を介して入力されるスタイルコードにより、ソースイメージ１００ａにおいてどのドメインを基準にイメージを変換するかを決定できる。 As described above, the image generator 110 according to the present invention can determine, from the style code input through the style code input unit 120, which domain is used as the basis for transforming the source image 100a.

すなわち、スタイルコードは、特定ドメイン（または、ターゲットドメイン）に対するドメイン特性情報及び前記特定ドメインを基準に抽出されたスタイル情報を含むことができる。一方、スタイルコードに含まれたドメイン特性情報に基づいて、ソースイメージ１００ａの変換対象ターゲットドメインが決定される。 That is, the style code may include domain characteristic information for a specific domain (or target domain) and style information extracted based on the specific domain. On the other hand, the target domain to be transformed of the source image 100a is determined based on the domain characteristic information included in the style code.

このようなスタイルコードは、図４に示されたマッピングネットワーク４００から抽出されることができる。イメージ生成部１１０は、マッピングネットワーク４００から抽出されたスタイルコードを用いて、ソースイメージの特定ドメインを、スタイルコードに含まれたドメイン特性情報による特定ドメイン（または、ターゲットドメイン）に変換することができる。 Such a style code can be extracted from the mapping network 400 shown in FIG. 4. Using the style code extracted from the mapping network 400, the image generator 110 can convert the specific domain of the source image into the specific domain (or target domain) indicated by the domain characteristic information included in the style code.

より具体的に、図４に示されたように、マッピングネットワーク４００は、マッピングネットワーク部４１０、入力部４２０、及び出力部４３０のうち、少なくとも１つを備えることができる。 More specifically, as shown in FIG. 4, the mapping network 400 can include at least one of a mapping network unit 410, an input unit 420, and an output unit 430.

マッピングネットワーク部４１０は、ガウス分布４００ａからノイズ情報（ｚ１ないしｚ７）を抽出し、抽出されたノイズ情報を利用してスタイルコードを生成できる。 The mapping network unit 410 can extract noise information (z1 to z7) from the Gaussian distribution 400a and generate a style code using the extracted noise information.

このようなノイズ情報は、潜在コード（ｌａｔｅｎｔｃｏｄｅ）とも命名されることができる。 Such noise information can also be named latent code.

マッピングネットワーク部４１０は、ガウス分布４００ａからランダムにサンプリングを行うことにより、様々なドメイン及び様々なスタイルを有する様々なスタイルコードを生成できる。 Mapping network unit 410 can generate different style codes with different domains and different styles by randomly sampling from Gaussian distribution 400a.

マッピングネットワーク部４１０は、このようなガウス分布４００ａからサンプリングを行ってノイズ情報（潜在コードまたはノイズ）を抽出できる。このように抽出されたノイズ情報は、特定ドメインに対するスタイル情報になることができる。 The mapping network unit 410 can extract noise information (latent code or noise) by sampling from such a Gaussian distribution 400a. The noise information extracted in this way can be style information for a specific domain.

マッピングネットワーク部４１０は、スタイルコードに反映しようとする特定ドメインの情報とガウス分布４００ａから抽出された特定ノイズ情報とを組み合わせることができる。そして、マッピングネットワーク部４１０は、前記組み合わせに基づいて、特定ドメインに対する特性情報及び前記抽出された特定ノイズ情報に対応するスタイル情報を含むスタイルコードを生成できる。 The mapping network unit 410 can combine specific domain information to be reflected in the style code and specific noise information extracted from the Gaussian distribution 400a. Based on the combination, the mapping network unit 410 may generate a style code including characteristic information for a specific domain and style information corresponding to the extracted specific noise information.

このとき、ガウス分布４００ａは、複数のイメージに対するものであって、複数のイメージに対するデータセット（ｄａｔａｓｅｔ）の確率分布でありうる。 At this time, the Gaussian distribution 400a is for a plurality of images and may be a probability distribution of a data set for the plurality of images.

前述したように、マッピングネットワーク部４１０は、ノイズ情報からスタイルコードを変換するとき、変換されたスタイルコードにドメインの情報が含まれるようにスタイルコードを生成できる。 As described above, when converting noise information into a style code, the mapping network unit 410 can generate the style code so that the converted style code includes domain information.

例えば、図５に示されたように、ガウス分布４００ａから特定ノイズ情報ｚ１が抽出された場合、当該ノイズ情報ｚ１がどのドメインに対することであるかによって、互いに異なるスタイルコードが生成され得る。 For example, as shown in FIG. 5, when specific noise information z1 is extracted from the Gaussian distribution 400a, different style codes may be generated depending on which domain the noise information z1 corresponds to.

すなわち、マッピングネットワーク部４１０は、ガウス分布４００ａから同一ノイズ情報が抽出されても、基準になるドメインによって、互いに異なるスタイルコードを生成できる。 That is, even when the same noise information is extracted from the Gaussian distribution 400a, the mapping network unit 410 can generate different style codes depending on the reference domain.

このために、マッピングネットワーク部４１０は、互いに異なるドメインに対するスタイルコードを出力するための複数の出力分岐があるＭＬＰ（ｍｕｌｔｉｌａｙｅｒｐｅｒｃｅｐｔｒｏｎ）（ＭＬＰｗｉｔｈｍｕｌｔｉｐｌｅｏｕｔｐｕｔｂｒａｎｃｈｅｓ）で構成されることができる。このような、同じノイズ情報に対して互いに異なるスタイルコードが生成され得る。この場合、互いに異なるスタイルコードは、各々互いに異なるターゲットドメインに対応することができる。 To this end, the mapping network unit 410 can be configured as a multilayer perceptron (MLP) with multiple output branches, each branch outputting the style code for a different domain. In this way, different style codes can be generated for the same noise information. In this case, each of the different style codes can correspond to a different target domain.
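
An MLP with multiple output branches can be sketched in plain Python as below. This is an illustrative sketch, not the patented implementation: the class name `MappingNetwork`, the single shared hidden layer, the layer sizes, and the random (untrained) weights are all assumptions. The point is only that one latent code yields a different style code for each domain branch.

```python
import math
import random


def linear(weights, bias, x):
    # Dense layer: one output per weight row.
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(v):
    return [max(0.0, a) for a in v]


def init_layer(rng, n_out, n_in):
    # Random, untrained weights; a real network would learn these.
    weights = [[rng.gauss(0.0, 1.0 / math.sqrt(n_in)) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


class MappingNetwork:
    """Hypothetical MLP: one shared layer, then one output branch per domain."""

    def __init__(self, latent_dim, hidden_dim, style_dim, domains, seed=0):
        rng = random.Random(seed)
        self.shared = init_layer(rng, hidden_dim, latent_dim)
        self.branches = {d: init_layer(rng, style_dim, hidden_dim) for d in domains}

    def __call__(self, z, domain):
        hidden = relu(linear(*self.shared, z))
        # Only the branch of the requested target domain is evaluated.
        return linear(*self.branches[domain], hidden)


rng = random.Random(42)
z1 = [rng.gauss(0.0, 1.0) for _ in range(16)]  # latent code sampled from a Gaussian
net = MappingNetwork(latent_dim=16, hidden_dim=32, style_dim=8, domains=["female", "male"])
s_female = net(z1, "female")
s_male = net(z1, "male")
```

Selecting the branch by domain name is what lets the same `z1` map to distinct style codes, one per target domain.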

より具体的に、図５において特定ノイズ情報ｚ１は、図１及び図２において説明した基準イメージ１０１ｂを表すためのデータを含むことができる。 More specifically, the specific noise information z1 in FIG. 5 can include data representing the reference image 101b described with reference to FIGS. 1 and 2.

マッピングネットワーク部４１０は、基準イメージ１０１ｂに対応するノイズ情報ｚからスタイルコードを生成できる。この場合、マッピングネットワーク部４１０は、互いに異なる様々なドメインを基準にスタイルコードを生成できる。すなわち、マッピングネットワーク部４１０は、特定ドメインを基準に互いに異なるスタイルコードを生成できる。 The mapping network unit 410 can generate a style code from the noise information z corresponding to the reference image 101b. In this case, the mapping network unit 410 can generate style codes with reference to various different domains. That is, the mapping network unit 410 can generate a different style code for each specific domain used as the reference.

例えば、図５に示されたように、スタイルコードに含まれる特定ドメイン（ターゲットドメイン）の基準が「性別」である場合、マッピングネットワーク部４１０は、基準イメージ１０１ｂの性別（例えば、「女性」）がドメイン特性情報として含まれるようにスタイルコードを生成できる。 For example, as shown in FIG. 5, when the criterion of the specific domain (target domain) included in the style code is "gender", the mapping network unit 410 can generate the style code so that the gender of the reference image 101b (e.g., "female") is included as domain characteristic information.

このとき、マッピングネットワーク部４１０は、ノイズ情報ｚから前記特定ドメインが有する特徴（例えば、「女性」の特徴：長髪、化粧）を中心にスタイル情報を抽出できる。 At this time, the mapping network unit 410 can extract style information based on the features of the specific domain (for example, features of 'female': long hair, makeup) from the noise information z.

さらに他の例として、図５に示されたように、スタイルコードに含まれる特定ドメイン（ターゲットドメイン）の基準が「年齢」である場合、マッピングネットワーク部４１０は、基準イメージ１０１ｂの年齢（例えば、「若者」）がドメイン特性情報として含まれるようにスタイルコードを生成できる。 As yet another example, as shown in FIG. 5, when the criterion of the specific domain (target domain) included in the style code is "age", the mapping network unit 410 can generate the style code so that the age of the reference image 101b (e.g., "young") is included as domain characteristic information.

このとき、マッピングネットワーク部４１０は、ノイズ情報ｚから前記特定ドメインが有する特徴（例えば、「若い女性」の特徴：滑らかな皮膚、化粧）を中心にスタイル情報を抽出できる。 At this time, the mapping network unit 410 can extract style information based on features of the specific domain (for example, features of 'young woman': smooth skin and makeup) from the noise information z.

また、図示したように、マッピングネットワーク部４１０は、ヘアカラー、皮膚カラー、ヘアスタイル、顔型など、様々なターゲットドメインを基準に、ノイズ情報ｚからスタイル情報を抽出できる。 Also, as illustrated, the mapping network unit 410 can extract style information from the noise information z based on various target domains such as hair color, skin color, hairstyle, and face shape.

一方、本発明において、「ターゲットドメインを基準にスタイル情報を抽出する」とは、ノイズ情報ｚから、ターゲットドメインと関連した特徴（例えば、ターゲットドメインが女性である場合、長髪、化粧）と関連した外貌的な特徴を有するスタイル情報を抽出することを意味できる。 Meanwhile, in the present invention, "extracting style information based on the target domain" can mean extracting, from the noise information z, style information having appearance features related to the features associated with the target domain (e.g., long hair and makeup when the target domain is female).

このように、本発明に係るマッピングネットワーク部４１０は、複数の基準イメージに対するガウス分布から基準イメージ１０１ｂに対応するノイズ情報ｚを抽出し、前記抽出されたノイズ情報ｚを利用して、基準イメージ１０１ｂの外貌スタイルと関連したスタイルコードを生成できる。 In this way, the mapping network unit 410 according to the present invention can extract the noise information z corresponding to the reference image 101b from a Gaussian distribution over a plurality of reference images, and can use the extracted noise information z to generate a style code related to the appearance style of the reference image 101b.

前述したように、マッピングネットワーク部４１０は、前記ノイズ情報に前記第２のグラフィックオブジェクトの外貌スタイルに基づいて分類可能な複数のドメインのうち、いずれか１つのドメイン（または、ターゲットドメイン、特定ドメイン）を基準にスタイルコードを生成できる。したがって、スタイルコードは、前記いずれか１つのドメイン（ターゲットドメイン）によるドメイン特性情報が反映されて存在することができる。 As described above, the mapping network unit 410 can generate a style code from the noise information with reference to any one domain (or target domain, specific domain) among a plurality of domains that can be classified based on the appearance style of the second graphic object. The style code therefore reflects the domain characteristic information of that one domain (target domain).

一方、図５に示されたように、スタイルコードは、ドメインを基準に互いに異なるスケール（ｓｃａｌｅ）を有するベクトルで構成されることができる。 On the other hand, as shown in FIG. 5, the style code may consist of vectors having different scales based on the domain.

例え、図示されてはいないが、マッピングネットワーク４００は、学習部をさらに備えることができる。マッピングネットワーク４００の学習部は、抽出されたノイズ情報をスタイルコードに変換する学習を行うことができる。 Although not shown, the mapping network 400 can further include a learning unit. The learning unit of the mapping network 400 can learn to convert the extracted noise information into style codes.

より具体的に、学習部は、抽出されたノイズ情報から、与えられた特定ドメインに対応するスタイル情報が抽出されるようにする学習を行うことができる。 More specifically, the learning unit can perform learning to extract style information corresponding to a given specific domain from the extracted noise information.

このような学習を介して、マッピングネットワーク部４１０は、ノイズ情報から前記特定ドメインが有する特徴（例えば、「女性」の特徴）をより正確に反映されるようにするスタイル情報を抽出できる。 Through such learning, the mapping network unit 410 can extract style information that more accurately reflects the features of the specific domain (for example, the features of 'female') from the noise information.

すなわち、学習部は、マッピングネットワーク部４１０が、ノイズ情報から特定ドメイン（ターゲットドメイン）に対してありそうな（確率が高い）スタイル情報を抽出させる学習を進行できる。マッピングネットワーク部４１０は、特定ドメインに対してありそうなスタイル情報を含むスタイルコードを生成することにより、ソースイメージをより実際に近く変換することができる。 That is, the learning unit can perform learning for the mapping network unit 410 to extract style information likely (high probability) for a specific domain (target domain) from noise information. The mapping network unit 410 can more realistically transform the source image by generating style codes that contain likely style information for a particular domain.

例えば、ターゲットドメインが女性である場合、初期にマッピングネットワーク部４１０から抽出されたスタイルコードに「ひげ」に対するスタイル情報が含まれた場合、学習を介して、「ひげ」に対するスタイル情報が除外され得る。 For example, when the target domain is female, if the style code initially extracted from the mapping network unit 410 includes style information for 'beard', the style information for 'beard' may be excluded through learning. .

一方、マッピングネットワーク４００は、ガウス分布内に存在するノイズ情報に基づいてスタイルコードを生成するので、連続する隣接したノイズ情報は、類似したスタイル情報を含むことができる。 On the other hand, because the mapping network 400 generates style codes based on noise information present within a Gaussian distribution, consecutive adjacent noise information can contain similar style information.

したがって、図１において説明したソースイメージ１００ａに対し、ターゲットドメインを「女性」としてイメージ変換を行う場合、図５において説明した特定ノイズ情報ｚ及びこれと隣接したノイズ情報に基づいて生成されたスタイルコードにより合成されたイメージ６１０、６２０、６３０、６４０、６６０は、図６に示されたように、隣り合った合成イメージと互いに類似した外貌スタイルを有することができる。 Therefore, when the source image 100a described in FIG. 1 is subjected to image transformation with the target domain as "female", the style code generated based on the specific noise information z described in FIG. Images 610, 620, 630, 640, and 660 synthesized by may have appearance styles similar to adjacent synthesized images, as shown in FIG.
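
Why neighboring noise values lead to similar style codes can be illustrated with a toy continuous mapping. The sketch below assumes a stand-in `toy_mapping` (one linear layer followed by tanh, with random weights) in place of a trained mapping network; the step size 0.05 and the dimensions are arbitrary. Because tanh is 1-Lipschitz, the distance between two style codes is bounded by the Frobenius norm of the weights times the distance between the latent codes, so small latent steps give small style changes.

```python
import math
import random


def toy_mapping(z, weights):
    """Stand-in for a trained mapping network: one linear layer followed by tanh."""
    return [math.tanh(sum(w * zi for w, zi in zip(row, z))) for row in weights]


def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


rng = random.Random(0)
weights = [[rng.gauss(0.0, 0.5) for _ in range(4)] for _ in range(3)]
z = [rng.gauss(0.0, 1.0) for _ in range(4)]
direction = [rng.gauss(0.0, 1.0) for _ in range(4)]

# Five adjacent latent codes along one direction, analogous to five adjacent composites.
latents = [[zi + 0.05 * k * di for zi, di in zip(z, direction)] for k in range(5)]
codes = [toy_mapping(latent, weights) for latent in latents]

latent_gaps = [distance(latents[k], latents[k + 1]) for k in range(4)]
style_gaps = [distance(codes[k], codes[k + 1]) for k in range(4)]
frobenius = math.sqrt(sum(w * w for row in weights for w in row))
```

Each entry of `style_gaps` is at most `frobenius` times the matching entry of `latent_gaps`, which is the sense in which adjacent latent codes give similar styles in this toy setting.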

以上で説明したように、本発明に係るマッピングネットワークシステムは、ノイズ情報から様々なドメインに対するスタイルコードを生成できる。さらに、イメージ生成部１１０は、このようなスタイルコードを用いて、ソースイメージに対する様々なドメインの変更を行いながら、様々なスタイルを有する合成イメージを生成できる。 As explained above, the mapping network system according to the present invention can generate style codes for various domains from noise information. In addition, the image generator 110 can use such style codes to generate composite images with different styles while making different domain changes to the source image.

一方、以上では、マッピングネットワークシステムを利用してスタイルコードを生成する方法について説明したが、本発明では、スタイルエンコーダを用いて、スタイルコードを生成することも可能である。以下では、スタイルエンコーダを活用してスタイルコードを生成する方法について添付された図面とともにより具体的に説明する。図７及び図８は、本発明に係るスタイルエンコーダを用いてスタイルコードを生成する方法を説明するための概念図である。 On the other hand, although the method of generating the style code using the mapping network system has been described above, the style encoder can also be used to generate the style code in the present invention. Hereinafter, a method of generating a style code using a style encoder will be described in more detail with attached drawings. 7 and 8 are conceptual diagrams for explaining a method of generating style codes using the style encoder according to the present invention.

前述したように、本発明に係るイメージ生成部１１０は、スタイルコード入力部１２０を介して入力されるスタイルコードを介して、ソースイメージ１００ａでどのドメインを基準にイメージを変換するかを決定できる。 As described above, the image generator 110 according to the present invention can determine, from the style code input through the style code input unit 120, which domain is used as the basis for transforming the source image 100a.

すなわち、スタイルコードは、特定ドメイン（または、ターゲットドメイン）に対するドメイン特性情報及び前記特定ドメインを基準に抽出されたスタイル情報を含むことができる。一方、スタイルコードに含まれたドメイン特性情報に基づいてソースイメージ１００ａの変換対象ターゲットドメインが決定される。 That is, the style code may include domain characteristic information for a specific domain (or target domain) and style information extracted based on the specific domain. Meanwhile, the target domain to be transformed of the source image 100a is determined based on the domain characteristic information included in the style code.

このようなスタイルコードは、図７に示されたスタイルエンコーダシステム７００から抽出されることができる。イメージ生成部１１０は、スタイルエンコーダシステム７００から抽出されたスタイルコードを用いて、ソースイメージの特定ドメインを、スタイルコードに含まれたドメイン特性情報による特定ドメイン（または、ターゲットドメイン）に変換することができる。 Such a style code can be extracted from the style encoder system 700 shown in FIG. 7. Using the style code extracted from the style encoder system 700, the image generator 110 can convert the specific domain of the source image into the specific domain (or target domain) indicated by the domain characteristic information included in the style code.

より具体的に、図７に示されたように、スタイルエンコーダシステム７００は、スタイルエンコーダ７１０、入力部７２０、及び出力部７３０のうち、少なくとも１つを備えることができる。 More specifically, the style encoder system 700 may include at least one of a style encoder 710, an input unit 720, and an output unit 730, as shown in FIG.

スタイルエンコーダ７１０は、入力部７２０を介して入力される基準イメージ（７０１ないし７０３）から特定ドメイン（または、ターゲットドメイン）を基準にスタイル情報を抽出できる。そして、スタイルエンコーダ部７１０は、抽出されたスタイル情報及び特定ドメインに対するドメイン特性情報を利用してスタイルコードを生成できる。 The style encoder 710 can extract style information based on a specific domain (or target domain) from the reference images 701 to 703 input through the input unit 720 . Also, the style encoder unit 710 may generate a style code using the extracted style information and the domain characteristic information for the specific domain.

スタイルエンコーダ７１０は、基準イメージ１０１ｂ（図７の図面符号７０１ないし７０６参照）から、基準イメージ１０１ｂの外貌スタイルと関連したスタイル情報を抽出できる。 The style encoder 710 can extract style information related to the appearance style of the reference image 101b from the reference image 101b (see reference numerals 701 to 706 in FIG. 7).

このとき、スタイルエンコーダ７１０は、基準イメージから、前記基準イメージ１０１ｂの外貌スタイルを基に分類可能な複数のドメインのうち、いずれか１つのドメインを基準に前記スタイル情報を抽出できる。ここで、いずれか１つのドメインは、特定ドメインまたはターゲットドメインと命名されることができる。 At this time, the style encoder 710 may extract the style information from the reference image based on one of a plurality of domains that can be classified based on the appearance style of the reference image 101b. Here, any one domain can be named a specific domain or a target domain.

図８に示された基準イメージ７０１を例を挙げて説明すれば、スタイルエンコーダ７１０は、基準イメージ７０１から、基準イメージ７０１の外貌スタイルを基に分類可能な複数のドメイン（例えば、女性、黒色の長髪、白色皮膚など）のうち、いずれか少なくとも１つのドメイン（例えば、女性）を基準にスタイル情報を抽出できる。 Using the reference image 701 shown in FIG. 8 as an example, the style encoder 710 extracts from the reference image 701 a plurality of domains (e.g., female, black, and black) that can be classified based on the appearance style of the reference image 701 . Style information can be extracted based on at least one domain (for example, female) among long hair, white skin, etc.).

ここで、基準になるドメインは、前述したように、ターゲットドメインと命名されることができる。スタイルエンコーダ７１０は、基準イメージ７０１から互いに異なるターゲットドメインに各々該当するスタイル情報を抽出し、これを利用してスタイルコードを生成できる。 Here, the reference domain can be named the target domain as described above. The style encoder 710 may extract style information corresponding to different target domains from the reference image 701 and generate style codes using the extracted style information.

例えば、図８に示されたように、スタイルコードに含まれる特定ドメイン（ターゲットドメイン）の基準が「性別」である場合、スタイルエンコーダ７１０は、基準イメージ７０１の性別（例えば、「女性」）がドメイン特性情報として含まれるようにスタイルコードを生成できる。 For example, as shown in FIG. 8, when the criterion of the specific domain (target domain) included in the style code is "gender", the style encoder 710 can generate the style code so that the gender of the reference image 701 (e.g., "female") is included as domain characteristic information.

このとき、スタイルエンコーダ７１０は、基準イメージ７０１から前記特定ドメインが有する特徴（例えば、「女性」の特徴：長髪、化粧）を中心にスタイル情報を抽出できる。 At this time, the style encoder 710 can extract style information from the reference image 701 based on features of the specific domain (e.g., features of 'female': long hair, makeup).

さらに他の例として、図８に示されたように、スタイルコードに含まれる特定ドメイン（ターゲットドメイン）の基準が「年齢」である場合、スタイルエンコーダ７１０は、基準イメージ７０１の年齢（例えば、「若者」）がドメイン特性情報として含まれるようにスタイルコードを生成できる。 As yet another example, as shown in FIG. 8, when the criterion of the specific domain (target domain) included in the style code is "age", the style encoder 710 can generate the style code so that the age of the reference image 701 (e.g., "young") is included as domain characteristic information.

このとき、スタイルエンコーダ７１０は、基準イメージ７０１から前記特定ドメインが有する特徴（例えば、「若い女性」の特徴：滑らかな皮膚、化粧）を中心にスタイル情報を抽出できる。 At this time, the style encoder 710 can extract style information from the reference image 701 based on features of the specific domain (for example, features of 'young woman': smooth skin, makeup).

また、図示したように、スタイルエンコーダ７１０は、ヘアカラー、皮膚カラー、ヘアスタイル、顔型など、様々なターゲットドメインを基準に、基準イメージ７０１からスタイル情報を抽出できる。 Also, as shown, the style encoder 710 can extract style information from the reference image 701 with reference to various target domains, such as hair color, skin color, hairstyle, face shape, and the like.

そして、このように抽出されたスタイル情報は、基準になるターゲットドメインに該当するドメイン特性情報を含んで、互いに異なるスタイルコードとして生成されることができる。 The style information thus extracted can be generated as different style codes including domain characteristic information corresponding to a target domain serving as a reference.

前述したように、スタイルエンコーダ７１０は、基準イメージ７０１の外貌スタイルを基に分類可能な複数のドメイン（例えば、性別、頭髪スタイル等）のうち、いずれか１つのドメイン（または、ターゲットドメイン、特定ドメイン）を基準にスタイルコードを生成できる。したがって、スタイルコードは、前記いずれか１つのドメイン（ターゲットドメイン）によるドメイン特性情報が反映されて存在することができる。一方、図８に示されたように、スタイルコードは、ドメインを基準に互いに異なるスケール（ｓｃａｌｅ）のベクトルで構成されることができる。 As described above, the style encoder 710 selects one of a plurality of domains (e.g., gender, hair style, etc.) that can be classified based on the appearance style of the reference image 701 (or a target domain, a specific domain, etc.). ) can be used to generate style code. Therefore, the style code can exist by reflecting the domain characteristic information according to one of the domains (target domain). On the other hand, as shown in FIG. 8, the style code may consist of vectors with different scales based on the domain.

以上で説明したように、本発明に係るイメージ生成システムのイメージ生成部は、マッピングネットワークまたはスタイルエンコーダシステムを介して生成されたスタイルコードを用いて、ソースイメージの特定ドメインを基準イメージのターゲットドメインに変更することができる。 As described above, the image generation unit of the image generation system according to the present invention uses the style code generated through the mapping network or style encoder system to map the specific domain of the source image to the target domain of the reference image. can be changed.

一方、本発明に係るイメージ生成システムは、学習を介してイメージ生成の性能を高めることができ、以下では、学習過程について添付された図面とともにより具体的に説明する。図９は、本発明に係るイメージ生成システムを学習する方法を説明するための概念図である。 Meanwhile, the image generation system according to the present invention can improve the performance of image generation through learning. Hereinafter, the learning process will be described in detail with reference to the accompanying drawings. FIG. 9 is a conceptual diagram for explaining a method of learning an image generation system according to the present invention.

本発明では、様々な学習アルゴリズムを利用して、イメージ生成システムを学習させることが可能である。イメージ生成部（１１０、図１参照）は、スタイルコードによるターゲットドメインと区分されない合成イメージを作るようにする学習が進行される。 In the present invention, the image generation system can be trained using various learning algorithms. The image generator (110, see FIG. 1) is trained to produce composite images that cannot be distinguished from the target domain specified by the style code.

例えば、図示されてはいないが、本発明に係るイメージ生成システム１００は、学習部をおき、様々な学習アルゴリズムを利用してイメージ生成部１１０に対する学習を行うことができる。イメージ生成部１１０は、スタイルコードにより定義されるターゲットドメイン（例えば、黒髪）と、さらに類似または同一の合成イメージを生成するように学習されることができる。 For example, although not shown, the image generation system 100 according to the present invention may include a learning unit and use various learning algorithms to train the image generation unit 110 . The image generator 110 can be trained to generate synthetic images that are more similar or identical to the target domain (eg, black hair) defined by the style code.

一例として、学習部は、識別部（Ｄｉｓｃｒｉｍｉｎａｔｏｒ、９００）を利用して学習を進行できる。識別部９００は、ターゲットドメイン（例えば、黒髪）を基準に、合成イメージ２０１と基準イメージ１０１ｂとを比較できる。そして、比較結果に基づいて、識別部９００は、合成イメージ２０１が実際（または、本物）イメージ（ｒｅａｌｉｍａｇｅ）であるか、または、作られた偽物イメージ（ｆａｋｅｉｍａｇｅ）であるかを判断できる。 For example, the learner may perform learning using a discriminator (900). The identification unit 900 can compare the synthetic image 201 and the reference image 101b based on the target domain (eg, black hair). Based on the comparison result, the identification unit 900 can determine whether the synthetic image 201 is a real image or a fake image.

識別部９００は、合成イメージ２０１が実際イメージであると判断された場合、「１」の値を出力し、偽物イメージであると判断された場合、「０」の値を出力できる。 The identifying unit 900 may output a value of '1' if the synthetic image 201 is determined to be a real image, and output a value of '0' if it is determined to be a fake image.

さらに、学習部は、識別部９００での比較結果に該当する、合成イメージ２０１と基準イメージ１０１との間の差値を用いてイメージ生成部１１０を学習できる。イメージ生成部１１０は、前記差値が最小になるようにするイメージを生成するように学習されることができる。 Further, the training unit can train the image generation unit 110 using the difference value between the synthetic image 201 and the reference image 101 corresponding to the comparison result of the identification unit 900 . The image generator 110 can be trained to generate an image that minimizes the difference value.
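
The real/fake judgment of the discriminator 900 can be turned into training signals as sketched below. The patent text only states that the discriminator outputs "1" for real and "0" for fake and that the generator is trained to minimize a difference value; the use of binary cross-entropy here, the function names, and the choice of the non-saturating generator loss are assumptions made for illustration.

```python
import math


def bce(prediction, target, eps=1e-7):
    """Binary cross-entropy for one discriminator output in (0, 1)."""
    p = min(max(prediction, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def discriminator_loss(d_real, d_fake):
    # The discriminator should output 1 for real images and 0 for composites.
    return bce(d_real, 1.0) + bce(d_fake, 0.0)


def generator_loss(d_fake):
    # The generator is rewarded when the discriminator scores its composite as real.
    return bce(d_fake, 1.0)
```

For instance, `generator_loss(0.9)` is smaller than `generator_loss(0.1)`: the more convincingly the composite passes as a real image of the target domain, the lower the generator's loss.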

また、例え、図示されてはいないが、スタイルエンコーダシステム７００は、学習部をさらに備えることができる。スタイルエンコーダシステム７００の学習部は、イメージ生成部１１０を介して生成された合成イメージから、前記合成イメージのスタイルコードが抽出されるように前記スタイルエンコーダを制御できる。ここで、合成イメージは、スタイルエンコーダ部７１０により生成されたスタイルコードにより生成されたイメージでありうる。 Also, although not shown, the style encoder system 700 can further include a learning unit. The learning unit of the style encoder system 700 can control the style encoder so that the style code of a composite image is extracted from the composite image generated by the image generator 110. Here, the composite image can be an image generated using a style code produced by the style encoder 710.

学習部は、スタイルエンコーダ７１０により生成されたスタイルコードが反映された合成イメージを利用してスタイルエンコーダ７１０を学習させることができる。 The learning unit may train the style encoder 710 using the synthesized image reflecting the style code generated by the style encoder 710 .

より具体的に、学習部は、スタイルエンコーダ７１０に合成イメージを基準イメージとして入力し、合成イメージからスタイルコードを生成できる。このとき、ターゲットドメインは、合成イメージの生成に使用されたスタイルコードのターゲットドメインと同一に設定されることができる。 More specifically, the learning unit may input the synthesized image as a reference image to the style encoder 710 and generate the style code from the synthesized image. At this time, the target domain can be set to be the same as the target domain of the style code used to generate the synthetic image.

一方、学習部は、合成イメージを生成するために使用されたスタイルコード（または、基準イメージのスタイルコード、第１のスタイルコード）と、合成イメージから生成されたスタイルコード（または、合成イメージのスタイルコード、第２のスタイルコード）とを比較し、比較結果を利用してイメージ生成部１１０を学習させることができる。すなわち、イメージ生成部１１０を介して生成された合成イメージにターゲットドメインのスタイル情報が含まれているか判断し、判断結果に基づいてイメージ生成部１１０が学習される方式である。 On the other hand, the learning unit stores the style code used to generate the synthetic image (or the style code of the reference image, the first style code) and the style code generated from the synthetic image (or the style code of the synthetic image). code, second style code), and the image generator 110 can be trained using the comparison result. That is, it is determined whether style information of the target domain is included in the synthesized image generated through the image generator 110, and the image generator 110 learns based on the determination result.

前記学習部は、前記比較結果、ｉ）合成イメージを生成するために使用されたスタイルコード（または、基準イメージのスタイルコード、第１のスタイルコード）とｉｉ）合成イメージから生成されたスタイルコード（または、合成イメージのスタイルコード、第２のスタイルコード）とが互いに相違した場合、ｉ）合成イメージを生成するために使用されたスタイルコード（または、基準イメージのスタイルコード、第１のスタイルコード）とｉｉ）合成イメージから生成されたスタイルコード（または、合成イメージのスタイルコード、第２のスタイルコード）との差値が最小になるようにイメージ生成部１１０を学習させることができる。このとき、学習部は、スタイル再構成損失（ｓｔｙｌｅｒｅｃｏｎｓｔｒｕｃｔｉｏｎｌｏｓｓ）関数を利用して学習を行うことができる。 When the comparison shows that i) the style code used to generate the composite image (or the style code of the reference image, the first style code) and ii) the style code generated from the composite image (or the style code of the composite image, the second style code) differ from each other, the learning unit can train the image generator 110 so that the difference between the two is minimized. In doing so, the learning unit can use a style reconstruction loss function.
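
The comparison between the first and second style codes can be sketched as a simple distance. The text only speaks of a "difference value"; measuring it as a mean absolute (L1) difference, and the function name `style_reconstruction_loss`, are assumptions made for this illustration.

```python
def style_reconstruction_loss(first_code, second_code):
    """Mean absolute difference between the style code used to generate the
    composite image and the style code re-extracted from that composite."""
    if len(first_code) != len(second_code):
        raise ValueError("style codes must have the same length")
    return sum(abs(a - b) for a, b in zip(first_code, second_code)) / len(first_code)


# The loss is zero only when the composite fully carries the requested style.
perfect = style_reconstruction_loss([0.5, -1.0], [0.5, -1.0])
imperfect = style_reconstruction_loss([1.0, 2.0], [0.0, 4.0])
```

A lower value means the style encoder recovers from the composite image nearly the same style code that was fed to the image generator.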

Meanwhile, in addition to the training methods described above, the learning unit may train the image generation system according to the present invention using various loss functions (for example, a diversity sensitive loss function and a cycle consistency loss function).
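
The specification only names these two losses. The sketch below follows their commonly used definitions, which is an assumption on the editor's part: the diversity sensitive term measures how far apart two images generated from the same source with two different style codes are (the generator is trained to make it large), and the cycle consistency term measures how well the source image is recovered after translating it to the target style and back. The L1 distance and the `toy_generate` function are hypothetical choices made for illustration.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]
Generator = Callable[[Vector, Vector], Vector]


def mean_abs_diff(a: Vector, b: Vector) -> float:
    """Mean absolute (L1) difference between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def diversity_sensitive_loss(
    source: Vector, style_a: Vector, style_b: Vector, generate: Generator
) -> float:
    """Distance between two outputs generated from one source image.

    The generator is trained to make this value large, so that different
    style codes lead to visibly different images.
    """
    return mean_abs_diff(generate(source, style_a), generate(source, style_b))


def cycle_consistency_loss(
    source: Vector, source_style: Vector, target_style: Vector, generate: Generator
) -> float:
    """Distance between the source image and its round-trip reconstruction.

    source_style is assumed to be the style code of the source image itself,
    for example one produced by a style encoder.
    """
    translated = generate(source, target_style)
    reconstructed = generate(translated, source_style)
    return mean_abs_diff(source, reconstructed)


def toy_generate(image: Vector, style_code: Vector) -> List[float]:
    # Hypothetical generator: keeps the "content" (deviation from the mean)
    # and replaces the "style" (the mean) with the value in the style code.
    mean = sum(image) / len(image)
    return [p - mean + style_code[0] for p in image]


source = [1.0, 3.0]
diversity = diversity_sensitive_loss(source, [5.0], [7.0], toy_generate)
cycle = cycle_consistency_loss(source, [2.0], [5.0], toy_generate)
# A generator that ignores the style code produces no diversity at all.
collapsed = diversity_sensitive_loss(source, [5.0], [7.0], lambda img, s: list(img))
print(diversity, cycle, collapsed)
```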

As described above, the image generation system according to the present invention and the image generation method using the same can use a style code containing domain characteristics to generate an image whose domain corresponds to the domain characteristics contained in the style code.

In this case, because the style code in the present invention contains style information, the style code alone can specify both the style and the domain of the image to be generated.

Therefore, according to the present invention, the domain of the generated image can be defined in various ways depending on which domain's characteristics are reflected in the style code.

That is, in the present invention, by reflecting domain characteristics in the style code input to the image generator, a single image generator alone can generate various images corresponding to various different domains.

Therefore, the present invention can provide scalability in terms of domains: new images for various domains can be generated with a single image generator, without providing a separate image generator for each domain.

In addition, the present invention can generate images of different styles for the same domain depending on which style's information is contained in the style code. Therefore, simply by changing the style information contained in the style code, the present invention can generate images of various styles for the same domain and thereby provide diversity in terms of style.
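
The two properties summarized above, one generator serving several domains and several styles within one domain, can be illustrated with a toy sketch. Everything in it is hypothetical and chosen only for illustration: the domain names, the per-domain `DOMAIN_HEADS` parameters, and the arithmetic inside `encode_style` and `generate` are not taken from the specification. The point is only that the domain and the style both enter the single generator through the style code.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]

# Hypothetical per-domain parameters (scale, bias) of the style encoder.
DOMAIN_HEADS: Dict[str, Tuple[float, float]] = {
    "cat": (2.0, 1.0),
    "dog": (-1.0, 0.5),
}


def encode_style(reference: Vector, domain: str) -> List[float]:
    """Build a style code that carries both style and domain information."""
    # Style information: a shared feature extracted from the reference image.
    feature = sum(reference) / len(reference)
    # Domain characteristic: the feature is mapped through the chosen domain.
    scale, bias = DOMAIN_HEADS[domain]
    return [scale * feature + bias]


def generate(source: Vector, style_code: Vector) -> List[float]:
    """Single generator shared by every domain; it only sees the style code."""
    return [p + style_code[0] for p in source]


source = [0.0, 1.0]
reference_a = [1.0, 3.0]
reference_b = [2.0, 4.0]

# Same generator, different domains.
cat_image = generate(source, encode_style(reference_a, "cat"))
dog_image = generate(source, encode_style(reference_a, "dog"))
# Same generator, same domain, different reference style.
other_cat_image = generate(source, encode_style(reference_b, "cat"))
print(cat_image, dog_image, other_cat_image)
```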

Meanwhile, the present invention described above may be implemented as a program that is executed by one or more processes on a computer and that can be stored on a computer-readable medium.

Furthermore, the present invention described above can be implemented as computer-readable code or instructions on a medium on which a program is recorded. That is, the present invention can be provided in the form of a program.

Meanwhile, the computer-readable medium includes every kind of recording device that stores data readable by a computer system. Examples of the computer-readable medium include an HDD (Hard Disk Drive), an SSD (Solid State Disk), an SDD (Silicon Disk Drive), ROM, RAM, CD-ROM, magnetic tape, a floppy disk, and an optical data storage device, and also include implementation in the form of a carrier wave (for example, transmission over the Internet).

Furthermore, the computer-readable medium may be a server or cloud storage that includes a storage and that an electronic device can access through communication.

Furthermore, in the present invention, the computer described above is an electronic device equipped with a processor, that is, a CPU (Central Processing Unit), and its type is not particularly limited.

Meanwhile, the above detailed description should not be construed as restrictive in any respect and should be considered illustrative. The scope of the present invention should be determined by a reasonable interpretation of the appended claims, and all changes within the equivalent scope of the present invention are included in the scope of the present invention.

Claims

1. An image generation system comprising:
an image input unit that receives a source image to be transformed;
a style code input unit that receives a style code associated with an appearance style of a reference image;
an image generator that uses the style code to generate a synthesized image in which the appearance style of the reference image is reflected in the source image; and
a style encoder that extracts, from the reference image, style information related to the appearance style of the reference image,
wherein the style encoder extracts the style information from the reference image with reference to a specific domain of the reference image, and generates the style code including the style information and domain characteristic information according to the specific domain of the reference image, and
wherein the appearance style of the reference image is associated with the specific domain of the reference image.

2. The image generation system of claim 1, wherein
the source image comprises at least one appearance feature portion that determines an appearance identity of the source image, and
the image generator reflects the appearance style of the reference image in the source image mainly in the remaining portion of the source image other than the appearance feature portion.

3. The image generation system of claim 2, wherein, when the source image and the reference image correspond to a person,
the appearance feature portion of the source image corresponds to at least one of the person's eyes, nose, and mouth, and
the appearance style of the reference image is associated with at least one of the person's hairstyle, beard, age, skin color, and makeup.

4. The image generation system according to any one of claims 1 to 3, further comprising an identification unit, wherein
the identification unit identifies, with reference to the reference image, whether the synthesized image is a fake image generated by the image generator for the specific domain of the reference image, and
if the synthesized image is identified as a fake image as a result of the identification, the image generator is trained so that a difference value between the reference image and the synthesized image identified as the fake image is minimized.

5. The image generation system according to any one of claims 1 to 4, further comprising a learning unit, wherein the learning unit:
extracts, using the style encoder, a style code associated with the specific domain of the reference image from the synthesized image;
compares the style code of the synthesized image with the style code of the reference image; and
if the comparison shows that the style code of the synthesized image and the style code of the reference image differ from each other, trains the image generator so that the difference value between the style code of the synthesized image and the style code of the reference image is minimized.

6. An image generation method comprising:
receiving a source image to be transformed;
receiving a style code associated with an appearance style of a reference image;
generating, using the style code, a synthesized image in which the appearance style of the reference image is reflected in the source image;
extracting, from the reference image, style information associated with the appearance style of the reference image;
extracting the style information from the reference image with reference to a specific domain of the reference image; and
generating the style code including the style information and domain characteristic information according to the specific domain of the reference image,
wherein the appearance style of the reference image is associated with the specific domain of the reference image.