JP2021190062A

JP2021190062A - Image generation system and image generation method using the same

Info

Publication number: JP2021190062A
Application number: JP2020169539A
Authority: JP
Inventors: Yun Jey Choi; Young Jung Uh; Jung Woo Ha
Original assignee: Line Corp; Naver Corp
Current assignee: Z Intermediate Global Corp; Naver Corp
Priority date: 2020-05-29
Filing date: 2020-10-07
Publication date: 2021-12-13
Anticipated expiration: 2040-10-07
Also published as: JP7224323B2; KR102427484B1; KR20210147507A

Abstract

To provide a system that generates images. SOLUTION: An image generation system according to the present invention includes: an image input unit that receives a source image to be converted; a style code input unit that inputs a style code related to the appearance style of a reference image; and an image generation unit that uses the style code to generate a composite image in which the appearance style of the reference image is reflected in the source image. SELECTED DRAWING: Figure 1

Description

The present invention relates to a system for generating images and an image generation method using the same.

Image generation technology, which creates new images by converting some features of an image into other features or by compositing multiple images, is not only used for various purposes in industry but has recently also come to be widely used by general users as a form of entertainment.

With the development of artificial intelligence, image generation technology is advancing day by day, and it has in fact reached a level at which the results are difficult for the human eye to distinguish.

In particular, image generation technology has advanced dramatically on the basis of the generative adversarial network (GAN), which was devised by Professor Yoshua Bengio's research team in 2014.

A generative adversarial network (GAN) consists of a generative model, which learns a probability distribution, and a discriminative model, which distinguishes between different sets. The image generation model (or generator) is trained to create fake images belonging to a target domain so as to fool the discriminative model as much as possible. The discriminative model (or discriminator), in turn, is trained to distinguish the fake images presented by the generative model from real images as accurately as possible with respect to the target domain.

Training the generative model to fool the discriminative model in this way is called an adversarial process. By developing the generative model and the discriminative model against each other through this adversarial process, a generative adversarial network has become able to generate, for a target domain, images that closely resemble real images, that is, fake images.
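The adversarial process described above can be illustrated with a minimal sketch. It assumes the discriminator outputs a probability that an image is real and uses the standard binary cross-entropy formulation of the GAN losses (with the common non-saturating generator loss); the function names are illustrative and do not come from the patent.

```python
import math
from typing import Sequence


def bce(prob: float, label: float) -> float:
    """Binary cross-entropy for one prediction; prob is clamped to avoid log(0)."""
    p = min(max(prob, 1e-7), 1.0 - 1e-7)
    return -(label * math.log(p) + (1.0 - label) * math.log(1.0 - p))


def discriminator_loss(d_real: Sequence[float], d_fake: Sequence[float]) -> float:
    """The discriminator wants real images scored near 1 and fake images near 0."""
    real_term = sum(bce(p, 1.0) for p in d_real) / len(d_real)
    fake_term = sum(bce(p, 0.0) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_loss(d_fake: Sequence[float]) -> float:
    """The generator wants its fake images scored near 1 (non-saturating form)."""
    return sum(bce(p, 1.0) for p in d_fake) / len(d_fake)
```

When the discriminator is fooled (high scores on fake images), `generator_loss` falls while `discriminator_loss` rises, which is the opposition the text calls the adversarial process.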

However, because the image generation model and the discriminative model in such a generative adversarial network are trained with respect to a target domain, there is a limitation in that a new image generation model and a new discriminative model must be trained whenever the target domain changes.

Accordingly, there remains a need for an image generation method that can flexibly handle various target domains.

The present invention provides an image generation system capable of generating various images corresponding to different target domains, and an image generation method using the same.

To solve the problem described above, an image generation system according to the present invention may include an image input unit that receives a source image to be converted, a style code input unit that inputs a style code related to the appearance style of a reference image, and an image generation unit that uses the style code to generate a composite image in which the appearance style of the reference image is reflected in the source image.

The image generation system according to the present invention can use a style code that includes domain characteristics to generate an image whose domain corresponds to the domain characteristics included in the style code.

A conceptual diagram for explaining the image generation system according to the present invention and the image generation method using the same.

A conceptual diagram for explaining the image generation system according to the present invention and the image generation method using the same.

A flowchart for explaining the image generation method according to the present invention.

A conceptual diagram for explaining a method of generating a style code using the mapping network according to the present invention.

A conceptual diagram for explaining a method of generating a style code using the mapping network according to the present invention.

A conceptual diagram for explaining a method of generating a style code using the mapping network according to the present invention.

A conceptual diagram for explaining a method of generating a style code using the style encoder according to the present invention.

A conceptual diagram for explaining a method of generating a style code using the style encoder according to the present invention.

A conceptual diagram for explaining a method of training the image generation system according to the present invention.

Hereinafter, the embodiments disclosed in this specification are described in detail with reference to the accompanying drawings. Identical or similar components are given the same reference numerals regardless of the figure, and redundant descriptions of them are omitted. The suffixes "module" and "unit" used for components in the following description are assigned or used interchangeably only for ease of drafting the specification and do not in themselves have distinct meanings or roles. In describing the embodiments disclosed in this specification, detailed descriptions of related known technology are omitted where they could obscure the gist of the embodiments. The accompanying drawings are provided only to facilitate understanding of the embodiments disclosed in this specification; they do not limit the technical idea disclosed herein, which should be understood to include all modifications, equivalents, and substitutes falling within the spirit and technical scope of the present invention.

Terms that include ordinal numbers, such as "first" and "second", may be used to describe various components, but the components are not limited by these terms. The terms are used only to distinguish one component from another.

When a component is referred to as being "coupled" or "connected" to another component, it may be directly coupled or connected to that other component, but it should be understood that intervening components may also be present. In contrast, when a component is referred to as being "directly coupled" or "directly connected" to another component, it should be understood that no intervening components are present.

A singular expression includes the plural unless the context clearly indicates otherwise.

In this application, terms such as "include" or "have" are intended to specify the presence of the features, numbers, steps, operations, components, parts, or combinations thereof described in the specification, and should be understood not to preclude in advance the presence or addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Meanwhile, the present invention provides an image generation system capable of generating various images corresponding to different target domains, and an image generation method using the same.

More specifically, the present invention provides an image generation system that can use a single image generation unit to generate different images respectively corresponding to different target domains, and an image generation method using the same.

Furthermore, the present invention provides an image generation system capable of generating images having various appearance styles with respect to a target domain, and an image generation method using the same. The present invention relates to an image generation system and an image generation method using the same; in particular, the image generation system according to the present invention can generate images based on "image-to-image translation".

Here, "image-to-image translation" means generating a new image based on a given input image. More specifically, it can mean generating a new image by converting at least a portion of the input image.

In particular, the present invention relates to an image generation system that, in performing "image-to-image translation", can generate new images corresponding to various styles and domains using only a single "image generation unit".

The image generation unit can generate images of various styles for the same domain, or images of the same style for different domains. The image generation system according to the present invention is described below in more detail with reference to the accompanying drawings. FIGS. 1 and 2 are conceptual diagrams for explaining the image generation system according to the present invention and the image generation method using the same, and FIG. 3 is a flowchart for explaining the image generation method according to the present invention.

As shown in FIG. 1, the image generation system 100 according to the present invention may be configured to include a generator (or image generation unit) 110 and a style code input unit 120 (hereinafter, for convenience of description, the generator 110 is referred to as the "image generation unit 110"). The image generation system 100 may further include at least one of an input unit 130 and an output unit 140.

The image generation unit 110 generates an image based on an image input through the input unit 130, and the generated image can be output through the output unit 140.

In the present invention, for convenience of description, the image input to the image generation unit 110 for image generation is referred to as a "source image".

Here, the source image can mean the image on which image conversion (or image generation) is based. The image generation unit 110 can generate a new image based on the source image. As shown in FIG. 1, a source image 100a can be input to the image generation unit 110 through the input unit 130.

Further, for convenience of description, the image generated by the image generation unit 110 is referred to as a "composite image" (or output image). As shown in FIG. 1, a composite image 200 can be output through the output unit 140.

In this way, the image generation unit 110 can generate the composite image 200 from the source image 100a input through the input unit 130, using a reference image 100b.

In doing so, the image generation unit 110 can generate the composite image 200 using a style code input through the style code input unit 120.
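The data flow between units 110 to 140 can be sketched as follows. This is only an illustration under simplifying assumptions: images are flat lists of floats, the style code is a short vector, and `toy_generator` is a hypothetical placeholder rather than the patent's trained network.

```python
from dataclasses import dataclass
from typing import Callable, List

Image = List[float]      # toy stand-in for pixel data
StyleCode = List[float]  # style code in vector form


@dataclass
class ImageGenerationSystem:
    """Mirrors units 110-140; the generator is any callable (source, style) -> image."""
    generator: Callable[[Image, StyleCode], Image]  # image generation unit 110

    def run(self, source: Image, style_code: StyleCode) -> Image:
        received = list(source)                      # input unit 130
        code = list(style_code)                      # style code input unit 120
        composite = self.generator(received, code)   # image generation unit 110
        return composite                             # output unit 140


def toy_generator(source: Image, style: StyleCode) -> Image:
    """Hypothetical placeholder: scale and shift each pixel by the first two style entries."""
    scale, shift = style[0], style[1]
    return [scale * px + shift for px in source]


system = ImageGenerationSystem(generator=toy_generator)
composite = system.run([0.0, 0.5, 1.0], [2.0, 0.1])
```

The point of the sketch is that one generator instance serves every request; only the style code passed alongside the source image changes.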

As shown in FIG. 1, the source image 100a may include at least one graphic object (for example, an image of a person). The image generation unit 110 can generate the composite image 200 by reflecting the appearance style given by the style code in this graphic object (or first graphic object).

In the present invention, a graphic object can be understood as an image of a thing, such as a person, an animal, a car, a flower, a bag, or a mountain.

In this specification, for convenience of description, the graphic object included in the source image 100a is referred to as the "first graphic object", the graphic object included in the composite image 200 as the "third graphic object", and the graphic object included in the reference image 100b as the "second graphic object". The second graphic object can mean not only an object included in the reference image 100b but also an object specified by noise information sampled from a Gaussian distribution. Such an object drawn from a Gaussian distribution can also be described as the target from which the style code is extracted (or the target referenced in order to extract the style code).

That is, the second graphic object may be included in the reference image 100b, or may correspond to specific noise of a Gaussian distribution based on the data distribution of a plurality of reference images.
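The two origins of a style code described above (a concrete reference image, or noise sampled from a Gaussian distribution) can be sketched as follows. Both functions are toy stand-ins invented for illustration: `style_from_reference` plays the role of a style encoder and `style_from_noise` the role of a mapping network, and `STYLE_DIM` is an assumed vector size.

```python
import math
import random
from typing import List, Optional

STYLE_DIM = 4  # assumed size of the style code vector


def style_from_reference(reference_pixels: List[float]) -> List[float]:
    """Toy 'style encoder': pools the reference image into STYLE_DIM averages."""
    chunk = max(1, len(reference_pixels) // STYLE_DIM)
    code = []
    for i in range(STYLE_DIM):
        part = reference_pixels[i * chunk:(i + 1) * chunk] or [0.0]
        code.append(sum(part) / len(part))
    return code


def style_from_noise(rng: random.Random) -> List[float]:
    """Toy 'mapping network': squashes Gaussian noise z ~ N(0, 1) into a style code."""
    z = [rng.gauss(0.0, 1.0) for _ in range(STYLE_DIM)]
    return [math.tanh(v) for v in z]


def get_style_code(reference_pixels: Optional[List[float]],
                   rng: random.Random) -> List[float]:
    """Use the reference image when one is given, otherwise fall back to sampled noise."""
    if reference_pixels is not None:
        return style_from_reference(reference_pixels)
    return style_from_noise(rng)
```

Either path yields a vector of the same size, so the downstream generator does not need to know which one produced it.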

In the following, for convenience of description, the second graphic object corresponding to specific noise of the Gaussian distribution is not referred to separately; all such cases are described uniformly as the "reference image".

That is, in the following, the second graphic object and the reference image are treated as having the same meaning. Accordingly, the reference image may also mean an object specified by a Gaussian distribution.

Also, in this specification, the source image and the first graphic object may be used interchangeably. That is, the appearance style of the source image can mean the appearance style of the first graphic object.

Here, the style code can be related to the appearance style of the reference image 100b. The "appearance style" is a factor that can define the visual appearance of the reference image 100b and can be determined by various elements such as hairstyle and gender.

As described above, the reference image 100b can mean the target referenced in order to change the appearance style of the source image 100a.

In this way, by reflecting in the source image 100a the style code corresponding to the appearance style of the reference image, the image generation unit 110 can generate a composite image 200 in which the appearance style of the reference image is reflected.

In the present invention, generating the composite image 200 can mean converting (or changing) the appearance style of the source image 100a, that is, of the first graphic object, with reference to the appearance style of the reference image 100b. As a result, a composite image can be generated in which a portion of the first graphic object has been converted to the appearance style of the reference image.

Meanwhile, in the present invention, the style code can include style information and domain characteristic information. The style information can be information about a style associated with the domain indicated by the domain characteristic information.

Based on the style information and the domain characteristic information included in the style code, the image generation unit 110 can generate the composite image 200 by converting the appearance style of the source image 100a (more specifically, of the first graphic object included in the source image 100a). In doing so, the image generation unit 110 can generate the composite image 200 from the source image 100a such that the composite image 200 has the domain corresponding to the domain characteristic information included in the style code.

As a result, the third graphic object included in the composite image 200 can be a graphic object in which the style information and domain characteristic information included in the style code are reflected in the first graphic object. That is, the third graphic object can be an image in which the appearance style of the second graphic object is combined with the first graphic object.

In this way, the present invention can generate a composite image 200 based on the source image 100a by using a style code that includes style information and domain characteristic information.

That is, the image generation system 100 according to the present invention can generate the composite image 200 by changing a specific domain of the source image 100a to the specific domain of the reference image 100b.

As shown in FIG. 2, the style code can include information about the style and domain of each of the reference images 101b, 102b, 103b, 104b, 105b, and 106b.

As shown in FIG. 2, the style code can take the form of a vector. The style code input unit 120 can input a style code in this vector form to the image generation unit 110 through adaptive instance normalization (AdaIN).
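As a rough illustration of how a style vector can enter a generator through AdaIN, the sketch below normalizes each feature channel to zero mean and unit variance and then rescales and shifts it with values taken from the style code. The text does not say how the vector is turned into per-channel parameters, so the layout used here (n scales followed by n shifts, read directly from the vector) is an assumption made for the example.

```python
import math
from typing import List


def adain(channels: List[List[float]], style_code: List[float],
          eps: float = 1e-5) -> List[List[float]]:
    """Instance-normalize each channel, then scale and shift it with the style code."""
    n = len(channels)
    if len(style_code) != 2 * n:
        # Assumed layout: n per-channel scales followed by n per-channel shifts.
        raise ValueError("style_code must hold one scale and one shift per channel")
    scales, shifts = style_code[:n], style_code[n:]
    out = []
    for values, scale, shift in zip(channels, scales, shifts):
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        std = math.sqrt(var + eps)  # eps guards against division by zero
        out.append([scale * (v - mean) / std + shift for v in values])
    return out
```

After the call, each output channel has approximately the mean and standard deviation dictated by the style code, regardless of the statistics of the input features; this is how the style, rather than the source content, sets the channel statistics.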

As described above, the style code can include style information and domain characteristic information for specifying the style and domain of the reference image 100b. To aid understanding of the present invention, the meanings of style information, domain, and domain characteristic information are explained below.

First, "style information" means information about the appearance style of a graphic object, that is, its visual features (or visual appearance).

Here, a visual feature can mean a feature related to visible appearance, such as hairstyle.

Such style information can include feature information for at least one of a plurality of categories (which may also be called style categories, attributes, and so on).

Here, a category or attribute can be understood as a classification criterion for distinguishing the meaningful visual features of a graphic object. A category can also be understood as an element for defining the appearance style of a graphic object.

The feature information for a category, in turn, can mean a data representation of what visual feature the graphic object has in that category.

The "feature information for a category" may also be called an "attribute value".

To describe "categories" (or attributes) more specifically, the kinds of categories (or attributes) for expressing the appearance style, that is, the visual features, of a graphic object can be highly varied.

For example, gender, age, hairstyle, hair color, skin color, makeup, beard, face shape, facial expression, glasses, accessories, eyebrow shape, eye shape, lip shape, nose shape, ear shape, and philtrum shape can each be understood as an individual category (or attribute).

The style information can include both identification information for a category (category type, category index information, and the like) and feature information for that category.

For example, the identification information for a category may be "hairstyle", and the feature information for that category may be "wavy blonde hair".

In this way, the style code can include style information containing information on at least one of the various categories that can define the appearance style of a graphic object (including at least one of the identification information for the category and the feature information for the category).

For example, consider the first composite image 201 and the second composite image 202, among the composite images 200 shown in FIG. 1, from the perspective of the "hairstyle" category. In this case, the first composite image 201 can have, for the hairstyle category, feature information (that is, style information) corresponding to the "wavy black hair 201a" of the first reference image 101b. The second composite image 202 can have, for the hairstyle category, feature information (that is, style information) corresponding to the "wavy blonde hair with bangs 202a" of the second reference image 102b.

Thus, the first and second composite images 201 and 202 can have different style information for the same category (for example, the "hairstyle" category).

Accordingly, the appearance style of the composite image can vary depending on which categories, and which features, the style information included in the style code covers.

Accordingly, the image generation unit 110 according to the present invention can reflect, in the source image 100a, a style code that includes style information extracted from the appearance style of the reference image 100b. The image generation unit 110 can thereby generate a composite image 200 having the appearance style of the reference image 100b.

In this way, the image generation unit 110 can convert at least one category of the source image 100a based on the style information included in the style code.

Among the plurality of categories that define the appearance style of the source image 100a (or first graphic object), the image generation unit 110 can perform the conversion with respect to the category that is the same as, or corresponds to, the category included in the style information.

Here, converting a specific category of the source image 100a means converting the feature information, or attribute value, of the source image 100a for that category; when this feature information is changed, the visual appearance for that category changes.
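The relationship between categories, attribute values, and category-wise conversion can be made concrete with a small symbolic model. This is purely illustrative: in the system described here the style code is a vector, not a dictionary of labels, and the class and method names below are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class StyleInfo:
    """Maps category identification (e.g. "hairstyle") to feature information."""
    features: Dict[str, str] = field(default_factory=dict)

    def add(self, category: str, attribute_value: str) -> None:
        self.features[category] = attribute_value

    def apply_to(self, appearance: Dict[str, str]) -> Dict[str, str]:
        """Return a copy of `appearance` with only the covered categories replaced."""
        converted = dict(appearance)
        for category, attribute_value in self.features.items():
            if category in converted:  # convert only the matching category
                converted[category] = attribute_value
        return converted


source_appearance = {"hairstyle": "straight brown hair", "glasses": "none"}
style = StyleInfo()
style.add("hairstyle", "wavy blonde hair")
composite_appearance = style.apply_to(source_appearance)
```

Only the "hairstyle" entry changes in `composite_appearance`; categories the style information does not cover, such as "glasses", keep the source's attribute values.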

Next, the domain and the domain characteristic information are described.

A domain can mean the feature information (or attribute value) for at least one category that serves as a reference, among the plurality of different categories that distinguish the appearance style of an image (or graphic object) described above.

Here, the "reference" can be taken in various senses, such as a reference for image conversion, for image classification, or for image distinction.

When a plurality of different images are described as "having the same attribute value for a specific category" or "having different common attribute values for a specific category", the "attribute value for the specific category" is the domain.

For example, when the domain is described with respect to the "gender" category among the plurality of categories, the first, second, and third images 201, 202, and 203 have the same domain, as shown in FIG. 2. The fourth, fifth, and sixth images 204, 205, and 206 likewise have the same domain. However, the domain of the first, second, and third images 201, 202, and 203 can differ from the domain of the fourth, fifth, and sixth images 204, 205, and 206. That is, the domain of the first, second, and third images 201, 202, and 203 is "female", and the domain of the fourth, fifth, and sixth images 204, 205, and 206 is "male". Here, "female" or "male" is the domain.
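The gender example above amounts to grouping images by their attribute value for one chosen category. A minimal sketch, using the image numbers from FIG. 2 as keys and a hypothetical `group_by_domain` helper:

```python
from collections import defaultdict
from typing import Dict, List


def group_by_domain(images: Dict[str, Dict[str, str]],
                    category: str) -> Dict[str, List[str]]:
    """Group image ids by their attribute value (the domain) for one category."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for image_id, attributes in images.items():
        groups[attributes[category]].append(image_id)
    return dict(groups)


# Attribute values follow the "gender" example in the text.
images = {
    "201": {"gender": "female"},
    "202": {"gender": "female"},
    "203": {"gender": "female"},
    "204": {"gender": "male"},
    "205": {"gender": "male"},
    "206": {"gender": "male"},
}
domains = group_by_domain(images, "gender")
```

Choosing a different reference category would partition the same images differently, which is why the domain depends on the category taken as the reference.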

このように、ドメインは、外貌スタイルと関連した様々なカテゴリーに対する属性値のうち、少なくとも１つであって、イメージの変換、イメージの分類、またはイメージの区分基準になる指標でありうる。 Thus, the domain is at least one of the attribute values for the various categories associated with the appearance style and can be an index that serves as an image transformation, image classification, or image classification criterion.

Meanwhile, the domain characteristic information included in the style code is data representing a specific domain (or target domain), and can include a specific category (or attribute) that classifies appearance style together with the characteristic information (attribute value) for that category.

The image generation unit 110 can determine the domain of the composite image 200 based on the domain characteristic information included in the style code.

The image generation unit 110 can convert the source image 100a so that the composite image 200 has the domain indicated by the domain characteristic information included in the style code.

Here, the domain characteristic information included in the style code can be information about a specific domain of the reference image. That is, the image generation unit 110 can convert the source image 100a so that the composite image 200 has the same domain as the specific domain of the reference image.

For example, when the style code includes domain characteristic information for the specific domain "male" derived from the fourth, fifth, and sixth reference images 104b, 105b, 106b, the fourth, fifth, and sixth images 204, 205, 206 generated by the image generation unit 110 can have the "male" domain.

In this way, the image generation unit 110 can reflect the domain characteristic information in the source image 100a so that the composite images 204, 205, 206 have the specific domain (for example, male) of the reference images (for example, the fourth, fifth, and sixth reference images 104b, 105b, 106b).

At this time, when the domain of the source image 100a differs from the specific domain indicated by the domain characteristic information included in the style code, the image generation unit 110 can determine the domain of the composite image 200 without regard to the domain of the source image 100a.

That is, when the specific domain of the source image 100a differs from that of the reference image 100b, the image generation unit 110 can determine the domain of the composite image (or third graphic object) by giving priority to the specific domain of the reference image 100b over that of the source image 100a. As a result, the composite image 200 has the specific domain of the reference image 100b.

Meanwhile, when converting the source image 100a based on the style code, the image generation unit 110 can change the appearance style of the remaining portion while using, as a reference, at least one appearance feature portion that determines the appearance identity of the source image 100a.

More specifically, the source image 100a can include at least one appearance feature portion that determines its appearance identity. The image generation unit 110 can reflect the appearance style of the reference image 100b in the source image 100a, focusing on the portion that remains after excluding the appearance feature portion of the source image 100a. Here, the appearance style of the reference image 100b can mean the appearance style defined with reference to the specific domain of the reference image that corresponds to the domain characteristic information included in the style code.

When the source image 100a and the reference image 100b each correspond to a person, the appearance feature portion of the source image 100a can be a portion corresponding to at least one of the person's eyes, nose, and mouth. In this case, the appearance style of the reference image 100b can relate to at least one of the person's hairstyle, beard, age, skin color, and makeup.

Meanwhile, the elements that determine the appearance identity of the source image 100a can vary, and the image generation unit 110 can determine those elements differently depending on the purpose for which the composite image 200 is synthesized.

Which portion the image generation unit 110 treats as the appearance identity can also be determined based on information input in advance.

For example, if the purpose of the composite image 200 is to show various hairstyle changes for a specific person, the appearance feature portion representing the appearance identity can be the portion corresponding to that person's eyes, nose, mouth, face shape, and the like.

Accordingly, as shown in FIG. 1, the image generation unit 110 can reflect the appearance style (for example, the hairstyle) of the reference image 100b in the source image 100a, focusing on the portion that remains after excluding the appearance feature portion corresponding to the appearance identity of the source image 100a. As a result, a composite image 200 that has the appearance style of the reference image 100b while maintaining the appearance identity of the source image 100a can be generated.

Here, the appearance identity can include the pose or posture of the graphic object included in the source image 100a.

That is, the image generation unit 110 can generate the composite image 200 so that it includes a graphic object having the same pose as the graphic object included in the source image 100a.

In this way, the image generation system 100 according to the present invention receives a source image via the input unit 110 (S310) and receives a style code related to an appearance style via the style code input unit 120 (S320). Using the received style code, it can then generate an image in which the appearance style corresponding to the style code is reflected (S330).

As described above, in the image generation system 100 according to the present invention, the image generation unit 110 can generate a composite image based on a style code that includes domain characteristic information.

Hereinafter, a method of generating the style code is described in more detail with reference to the accompanying drawings. FIGS. 4, 5, and 6 are conceptual diagrams for explaining a method of generating a style code using the mapping network according to the present invention.

As described above, the image generation unit 110 according to the present invention can determine, from the style code input via the style code input unit 120, which domain is used as the reference for converting the source image 100a.

That is, the style code can include domain characteristic information for a specific domain (or target domain) and style information extracted with reference to that specific domain. The target domain into which the source image 100a is to be converted is determined based on the domain characteristic information included in the style code.

Such a style code can be extracted from the mapping network 400 shown in FIG. 4. Using the style code extracted from the mapping network 400, the image generation unit 110 can convert the specific domain of the source image into the specific domain (or target domain) indicated by the domain characteristic information included in the style code.

More specifically, as shown in FIG. 4, the mapping network 400 can include at least one of a mapping network unit 410, an input unit 420, and an output unit 430.

The mapping network unit 410 can extract noise information (z1 to z7) from a Gaussian distribution 400a and generate a style code using the extracted noise information.

Such noise information can also be called a latent code.

By sampling randomly from the Gaussian distribution 400a, the mapping network unit 410 can generate a variety of style codes having various domains and various styles.

The mapping network unit 410 can extract noise information (a latent code, or noise) by sampling from the Gaussian distribution 400a. The noise information extracted in this way can become style information for a specific domain.

The mapping network unit 410 can combine information on the specific domain to be reflected in the style code with specific noise information extracted from the Gaussian distribution 400a. Based on this combination, the mapping network unit 410 can generate a style code that includes characteristic information for the specific domain and style information corresponding to the extracted specific noise information.

Here, the Gaussian distribution 400a relates to a plurality of images and can be a probability distribution of a dataset of the plurality of images.

As described above, when converting noise information into a style code, the mapping network unit 410 can generate the style code so that the converted style code includes domain information.

For example, as shown in FIG. 5, when specific noise information z1 is extracted from the Gaussian distribution 400a, different style codes can be generated depending on which domain the noise information z1 is applied to.

That is, even when the same noise information is extracted from the Gaussian distribution 400a, the mapping network unit 410 can generate different style codes depending on the reference domain.

To this end, the mapping network unit 410 can be configured as a multilayer perceptron (MLP) with multiple output branches, each outputting the style code for a different domain. In this way, different style codes can be generated for the same noise information, and each of these style codes can correspond to a different target domain.
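
The multi-branch MLP described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the class name `MappingNetwork`, the helper functions, the layer sizes, the random (untrained) weights, and the domain labels are all hypothetical and chosen only to show how one latent code can yield a different style code per target domain.

```python
import math
import random
from typing import List

Vector = List[float]


def linear(x: Vector, weights: List[Vector], bias: Vector) -> Vector:
    """Affine map: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x: Vector) -> Vector:
    return [max(0.0, v) for v in x]


def init_layer(rng: random.Random, n_in: int, n_out: int):
    """Randomly initialised (untrained) weights and zero bias."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


class MappingNetwork:
    """Shared MLP trunk followed by one output branch per domain (hypothetical sizes)."""

    def __init__(self, domains: List[str], latent_dim: int = 16,
                 hidden_dim: int = 32, style_dim: int = 8, seed: int = 0):
        rng = random.Random(seed)
        self.trunk = init_layer(rng, latent_dim, hidden_dim)
        self.branches = {d: init_layer(rng, hidden_dim, style_dim) for d in domains}

    def __call__(self, z: Vector, domain: str) -> Vector:
        hidden = relu(linear(z, *self.trunk))
        # The target domain selects which output branch produces the style code.
        return linear(hidden, *self.branches[domain])


rng = random.Random(42)
z1 = [rng.gauss(0.0, 1.0) for _ in range(16)]  # latent code sampled from a Gaussian
net = MappingNetwork(domains=["female", "male"])
style_female = net(z1, "female")
style_male = net(z1, "male")  # same z1, different branch, different style code
```

Selecting the branch by domain is one simple way to let a shared trunk reuse what it learns across domains while keeping the outputs domain-specific.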

More specifically, in FIG. 5, the specific noise information z1 can include data for representing the reference image 101b described with reference to FIGS. 1 and 2.

The mapping network unit 410 can generate a style code from the noise information z corresponding to the reference image 101b. In doing so, it can generate style codes with reference to various different domains. That is, the mapping network unit 410 can generate a different style code for each specific domain used as the reference.

For example, as shown in FIG. 5, when the reference for the specific domain (target domain) included in the style code is "gender", the mapping network unit 410 can generate the style code so that the gender of the reference image 101b (for example, "female") is included as domain characteristic information.

In this case, the mapping network unit 410 can extract style information from the noise information z with a focus on the features of that specific domain (for example, features of "female": long hair, makeup).

As another example, as shown in FIG. 5, when the reference for the specific domain (target domain) included in the style code is "age", the mapping network unit 410 can generate the style code so that the age of the reference image 101b (for example, "young") is included as domain characteristic information.

In this case, the mapping network unit 410 can extract style information from the noise information z with a focus on the features of that specific domain (for example, features of a "young female": smooth skin, makeup).

In addition, as illustrated, the mapping network unit 410 can extract style information from the noise information z with reference to various target domains such as hair color, skin color, hairstyle, and face shape.

In the present invention, "extracting style information with reference to a target domain" can mean extracting, from the noise information z, style information having appearance features related to the features associated with the target domain (for example, long hair and makeup when the target domain is female).

In this way, the mapping network unit 410 according to the present invention can extract noise information z corresponding to the reference image 101b from a Gaussian distribution over a plurality of reference images and, using the extracted noise information z, generate a style code related to the appearance style of the reference image 101b.

As described above, the mapping network unit 410 can generate a style code from the noise information with reference to any one domain (or target domain, specific domain) among a plurality of domains that can be classified based on the appearance style of the second graphic object. The style code therefore reflects the domain characteristic information of that one domain (target domain).

Meanwhile, as shown in FIG. 5, the style code can be composed of vectors having different scales for different domains.

Although not shown, the mapping network 400 can further include a learning unit. The learning unit of the mapping network 400 can perform learning for converting the extracted noise information into a style code.

More specifically, the learning unit can perform learning so that style information corresponding to a given specific domain is extracted from the extracted noise information.

Through such learning, the mapping network unit 410 can extract, from the noise information, style information that more accurately reflects the features of the specific domain (for example, the features of "female").

That is, the learning unit can train the mapping network unit 410 to extract, from the noise information, style information that is plausible (highly probable) for the specific domain (target domain). By generating a style code that includes style information plausible for the specific domain, the mapping network unit 410 allows the source image to be converted more realistically.

For example, when the target domain is female and a style code initially extracted from the mapping network unit 410 includes style information for a "beard", that style information can be excluded through learning.

Meanwhile, because the mapping network 400 generates style codes based on noise information lying within the Gaussian distribution, consecutive, adjacent pieces of noise information can contain similar style information.

Therefore, when image conversion is performed on the source image 100a described with reference to FIG. 1 with "female" as the target domain, the images 610, 620, 630, 640, 660 synthesized using style codes generated from the specific noise information z described with reference to FIG. 5 and from noise information adjacent to it can, as shown in FIG. 6, each have an appearance style similar to that of the neighboring composite images.
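
The effect of adjacent noise information yielding similar style codes can be illustrated with a small sketch. Everything here is an assumption made for illustration: `toy_mapping` is a hypothetical continuous stand-in for a trained mapping network, and walking along a straight line between two latent codes is just one way to obtain "adjacent" noise information.

```python
import math
import random
from typing import List

Vector = List[float]


def toy_mapping(z: Vector, weights: List[Vector]) -> Vector:
    """Hypothetical stand-in for a trained mapping network (any continuous map works)."""
    return [math.tanh(sum(w * v for w, v in zip(row, z))) for row in weights]


def lerp(a: Vector, b: Vector, t: float) -> Vector:
    """Linear interpolation between two latent codes."""
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]


def distance(a: Vector, b: Vector) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


rng = random.Random(0)
weights = [[rng.gauss(0.0, 0.5) for _ in range(8)] for _ in range(4)]
z_start = [rng.gauss(0.0, 1.0) for _ in range(8)]
z_end = [rng.gauss(0.0, 1.0) for _ in range(8)]

# Five latent codes on a straight line between z_start and z_end.
styles = [toy_mapping(lerp(z_start, z_end, i / 4), weights) for i in range(5)]

# Style codes of neighbouring latent codes are closer than those of the endpoints.
neighbour_gaps = [distance(styles[i], styles[i + 1]) for i in range(4)]
end_to_end_gap = distance(styles[0], styles[-1])
```

Because the mapping is continuous, small steps in the latent code produce small steps in the style code, which is why neighboring composite images look alike.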

As described above, the mapping network system according to the present invention can generate style codes for various domains from noise information. Using such style codes, the image generation unit 110 can generate composite images having various styles while changing the source image to various domains.

The above describes a method of generating a style code using the mapping network system, but in the present invention a style code can also be generated using a style encoder. Hereinafter, a method of generating a style code using the style encoder is described in more detail with reference to the accompanying drawings. FIGS. 7 and 8 are conceptual diagrams for explaining a method of generating a style code using the style encoder according to the present invention.

As described above, the image generation unit 110 according to the present invention can determine, from the style code input via the style code input unit 120, which domain is used as the reference for converting the source image 100a.

That is, the style code can include domain characteristic information for a specific domain (or target domain) and style information extracted with reference to that specific domain. The target domain into which the source image 100a is to be converted is determined based on the domain characteristic information included in the style code.

Such a style code can be extracted from the style encoder system 700 shown in FIG. 7. Using the style code extracted from the style encoder system 700, the image generation unit 110 can convert the specific domain of the source image into the specific domain (or target domain) indicated by the domain characteristic information included in the style code.

More specifically, as shown in FIG. 7, the style encoder system 700 can include at least one of a style encoder 710, an input unit 720, and an output unit 730.

The style encoder 710 can extract style information, with reference to a specific domain (or target domain), from the reference images (701 to 703) input via the input unit 720. The style encoder 710 can then generate a style code using the extracted style information and the domain characteristic information for the specific domain.

The style encoder 710 can extract, from the reference image 101b (see reference numerals 701 to 706 in FIG. 7), style information related to the appearance style of the reference image 101b.

In doing so, the style encoder 710 can extract the style information from the reference image with reference to any one domain among a plurality of domains that can be classified based on the appearance style of the reference image 101b. This one domain can be called the specific domain or the target domain.

Taking the reference image 701 shown in FIG. 8 as an example, the style encoder 710 can extract style information from the reference image 701 with reference to at least one domain (for example, female) among a plurality of domains (for example, female, long black hair, white skin) that can be classified based on the appearance style of the reference image 701.

As described above, the domain used as the reference can be called the target domain. The style encoder 710 can extract, from the reference image 701, style information corresponding to each of several different target domains and use it to generate style codes.

For example, as shown in FIG. 8, when the reference for the specific domain (target domain) included in the style code is "gender", the style encoder 710 can generate the style code so that the gender of the reference image 701 (for example, "female") is included as domain characteristic information.

In this case, the style encoder 710 can extract style information from the reference image 701 with a focus on the features of that specific domain (for example, features of "female": long hair, makeup).

As another example, as shown in FIG. 8, when the reference for the specific domain (target domain) included in the style code is "age", the style encoder 710 can generate the style code so that the age of the reference image 701 (for example, "young") is included as domain characteristic information.

In this case, the style encoder 710 can extract style information from the reference image 701 with a focus on the features of that specific domain (for example, features of a "young female": smooth skin, makeup).

In addition, as illustrated, the style encoder 710 can extract style information from the reference image 701 with reference to various target domains such as hair color, skin color, hairstyle, and face shape.

The style information extracted in this way can be generated as mutually different style codes, each including the domain characteristic information of the target domain used as its reference.

As described above, the style encoder 710 can generate a style code with reference to any one domain (or target domain, specific domain) among a plurality of domains (for example, gender, hairstyle) that can be classified based on the appearance style of the reference image 701. The style code therefore reflects the domain characteristic information of that one domain (target domain). Meanwhile, as shown in FIG. 8, the style code can be composed of vectors having different scales for different domains.

As described above, the image generation unit of the image generation system according to the present invention can change the specific domain of the source image to the target domain of the reference image by using a style code generated via the mapping network or the style encoder system.

Meanwhile, the image generation system according to the present invention can improve its image generation performance through learning. Hereinafter, the learning process is described in more detail with reference to the accompanying drawings. FIG. 9 is a conceptual diagram for explaining a method of training the image generation system according to the present invention.

In the present invention, the image generation system can be trained using various learning algorithms. The image generation unit (110, see FIG. 1) is trained to produce composite images that cannot be distinguished from images of the target domain specified by the style code.

For example, although not shown, the image generation system 100 according to the present invention can include a learning unit and train the image generation unit 110 using various learning algorithms. The image generation unit 110 can be trained to generate composite images that are more similar or identical to the target domain (for example, black hair) defined by the style code.

As an example, the learning unit can carry out training using a discriminator 900. The discriminator 900 can compare the composite image 201 with the reference image 101b with respect to the target domain (for example, black hair). Based on the comparison result, the discriminator 900 can determine whether the composite image 201 is a real image or a generated fake image.

The discriminator 900 can output a value of "1" when it determines that the composite image 201 is a real image, and a value of "0" when it determines that the composite image 201 is a fake image.

Further, the learning unit can train the image generation unit 110 using the difference value between the composite image 201 and the reference image 101b, which corresponds to the comparison result of the discriminator 900. The image generation unit 110 can be trained to generate images that minimize this difference value.
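
One common way to turn a real/fake decision into a training signal is the binary cross-entropy loss used in standard generative adversarial networks. The sketch below assumes that formulation; the description itself only states that a difference value is minimized, so the function names and the specific loss form are illustrative assumptions rather than the method of the invention.

```python
import math


def discriminator_loss(d_real: float, d_fake: float, eps: float = 1e-12) -> float:
    """Binary cross-entropy: push D(real) toward 1 and D(fake) toward 0."""
    return -(math.log(d_real + eps) + math.log(1.0 - d_fake + eps))


def generator_loss(d_fake: float, eps: float = 1e-12) -> float:
    """Non-saturating generator loss: push D(fake) toward 1."""
    return -math.log(d_fake + eps)


# d_fake is the discriminator's score (between 0 and 1) for a composite image.
# The generator's loss shrinks as the discriminator is fooled more often.
loss_when_caught = generator_loss(d_fake=0.1)
loss_when_fooling = generator_loss(d_fake=0.9)
```

Under this assumption, minimizing the generator loss corresponds to producing composite images that the discriminator scores close to "1".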

Also, although not shown, the style encoder system 700 can further include a learning unit. The learning unit of the style encoder system 700 can control the style encoder so that the style code of a composite image is extracted from the composite image generated via the image generation unit 110. Here, the composite image can be an image generated using a style code produced by the style encoder 710.

The learning unit can train the style encoder 710 using a composite image that reflects the style code generated by the style encoder 710.

More specifically, the learning unit can input the composite image into the style encoder 710 as a reference image and generate a style code from the composite image. At this time, the target domain can be set to be the same as the target domain of the style code that was used to generate the composite image.

Meanwhile, the learning unit can compare the style code used to generate the composite image (that is, the style code of the reference image, or first style code) with the style code generated from the composite image (that is, the style code of the composite image, or second style code), and can train the image generation unit 110 using the comparison result. In other words, the learning unit determines whether the composite image generated by the image generation unit 110 contains the style information of the target domain, and the image generation unit 110 is trained based on that determination.

When the comparison shows that i) the style code used to generate the composite image (the style code of the reference image, or first style code) and ii) the style code generated from the composite image (the style code of the composite image, or second style code) differ from each other, the learning unit can train the image generation unit 110 so that the difference value between the first style code and the second style code is minimized. At this time, the learning unit can perform the training using a style reconstruction loss function.
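The comparison between the first and second style codes can be illustrated with a short sketch. The text states only that the difference value between the two style codes is minimized using a style reconstruction loss. The mean absolute (L1) distance used here, the function name, and the representation of a style code as a plain list of floats are assumptions chosen for illustration.

```python
def style_reconstruction_loss(first_style_code, second_style_code):
    """Mean absolute difference between two style codes (assumed L1 form).

    first_style_code:  the code used to generate the composite image.
    second_style_code: the code the style encoder extracts from that composite.
    """
    if not first_style_code or len(first_style_code) != len(second_style_code):
        raise ValueError("style codes must be non-empty and of equal length")
    total = sum(abs(a - b) for a, b in zip(first_style_code, second_style_code))
    return total / len(first_style_code)
```

For example, comparing the hypothetical codes `[0.2, -1.0, 0.5]` and `[0.1, -0.8, 0.5]` gives a loss of roughly 0.1, and identical codes give exactly 0, which is the case in which the composite image fully preserves the style information it was generated from.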

In addition to the training methods described above, the learning unit can train the image generation system according to the present invention using various other loss functions (for example, a diversity-sensitive loss function and a cycle consistency loss function).
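The two additional losses are named here but not defined, so the following sketch relies on how these terms are commonly used, and every definition in it is an assumption. The diversity-sensitive loss is taken to reward differences between two composites produced from the same source image with two different style codes. The cycle consistency loss is taken to penalize the difference between a source image and its reconstruction after it is converted to the target style and back. Images are represented as flat lists of pixel values purely for illustration.

```python
def mean_abs_diff(a, b):
    """Mean absolute difference between two equally sized, non-empty pixel lists."""
    if not a or len(a) != len(b):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def diversity_sensitive_loss(image_from_code_1, image_from_code_2):
    # Assumed form: the minus sign turns "make the two composites differ"
    # into a quantity that can be minimized like the other losses.
    return -mean_abs_diff(image_from_code_1, image_from_code_2)


def cycle_consistency_loss(source_image, reconstructed_image):
    # Assumed form: penalize a source image that is not recovered after
    # converting it to the target style and back to its own style.
    return mean_abs_diff(source_image, reconstructed_image)
```

In this sketch both quantities are minimized: the first becomes more negative as the composites diverge, and the second reaches 0 when the reconstruction matches the source image exactly.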

As described above, the image generation system according to the present invention, and the image generation method using it, can use a style code that includes domain characteristics to generate an image having the domain that corresponds to those domain characteristics.

In the present invention, because the style code also includes style information, the style code alone can specify both the style and the domain of the image to be generated.

Therefore, according to the present invention, the domain of the generated image can be defined in various ways depending on which domain's characteristics are reflected in the style code.

That is, in the present invention, by reflecting the domain characteristics in the style code input to the image generation unit, a single image generation unit can generate various images corresponding to various mutually different domains.

Therefore, the present invention can provide extensibility in terms of domains: new images for various domains can be generated with a single image generation unit, without providing a separate image generation unit for each domain.
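One way to picture how the domain can be carried by the style code, rather than by a separate generator per domain, is a mapping network of the kind recited in the claims, which turns noise sampled from a Gaussian distribution into a different style code for each domain. The toy class below is a hypothetical stand-in: the per-domain linear branches, the dimensions, the random weights, and all names are assumptions for illustration, not the network of the invention.

```python
import random


class ToyMappingNetwork:
    """Hypothetical stand-in: one noise vector in, one style code per domain out."""

    def __init__(self, domains, noise_dim=4, style_dim=3, seed=0):
        rng = random.Random(seed)
        self.noise_dim = noise_dim
        # One weight matrix (style_dim x noise_dim) per domain, so the same
        # noise vector yields a different style code for each domain.
        self.weights = {
            domain: [[rng.uniform(-1.0, 1.0) for _ in range(noise_dim)]
                     for _ in range(style_dim)]
            for domain in domains
        }

    def style_code(self, noise, domain):
        if len(noise) != self.noise_dim:
            raise ValueError("noise has the wrong dimension")
        return [sum(w * z for w, z in zip(row, noise))
                for row in self.weights[domain]]


def sample_noise(dim, rng):
    # Noise information drawn from a standard Gaussian distribution.
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


network = ToyMappingNetwork(["black_hair", "blond_hair"])
noise = sample_noise(4, random.Random(1))
code_black = network.style_code(noise, "black_hair")
code_blond = network.style_code(noise, "blond_hair")
```

Because the domain is selected when the style code is produced, a single image generation unit that consumes `code_black` or `code_blond` would need no domain-specific copy of itself. Supporting another domain in this sketch only means adding another branch to the mapping network.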

Further, the present invention can generate images of different styles for the same domain depending on which style's information is included in the style code. Therefore, simply by changing the style information contained in the style code, the present invention can generate images of various styles for the same domain and thereby provide diversity in terms of style.

Meanwhile, the present invention described above can be implemented as a program that is executed by one or more processes on a computer and that can be stored in a computer-readable medium.

Further, the present invention described above can be implemented as computer-readable code or instructions on a medium on which a program is recorded. That is, the present invention can be provided in the form of a program.

A computer-readable medium includes every kind of recording device that stores data readable by a computer system. Examples of computer-readable media include an HDD (Hard Disk Drive), SSD (Solid State Disk), SDD (Silicon Disk Drive), ROM, RAM, CD-ROM, magnetic tape, floppy disk, and optical data storage device; the medium may also be implemented in the form of a carrier wave (for example, transmission over the Internet).

Further, the computer-readable medium may be a server or cloud storage that includes storage and that an electronic device can access through communication.

Further, in the present invention, the computer described above is an electronic device equipped with a processor, that is, a CPU (Central Processing Unit), and its type is not particularly limited.

The above detailed description should not be construed as restrictive in any respect and should be considered illustrative. The scope of the present invention must be determined by a reasonable interpretation of the appended claims, and all changes within the equivalent scope of the present invention are included in the scope of the present invention.

Claims

An image generation system comprising:
an image input unit that receives a source image to be converted;
a style code input unit that inputs a style code related to the appearance style of a reference image; and
an image generation unit that uses the style code to generate a composite image in which the appearance style of the reference image is reflected in the source image,
wherein the appearance style of the reference image is associated with a specific domain of the reference image.

The image generation system according to claim 1, wherein
the style code includes domain characteristic information corresponding to the specific domain of the reference image, and
the image generation unit generates the composite image based on the style code so that the composite image has the specific domain of the reference image.

The image generation system according to claim 2, wherein
the image generation unit generates the composite image by converting a specific domain of the source image into the specific domain of the reference image, and
the specific domain of the source image and the specific domain of the reference image are appearance attributes that correspond to each other.

The image generation system according to claim 3, wherein,
when the specific domain of the source image and the specific domain of the reference image differ from each other,
the image generation unit determines the domain of the composite image by giving priority to the specific domain of the reference image over the specific domain of the source image.

The image generation system according to any one of claims 1 to 4, further comprising a style encoder, wherein
the style encoder extracts, from the reference image, style information related to the appearance style of the reference image.

The image generation system according to claim 5, wherein
the style encoder extracts the style information from the reference image based on the specific domain of the reference image, and
generates the style code including the style information and domain characteristic information according to the specific domain of the reference image.

The image generation system according to claim 5, wherein
the style code has different vector values depending on which domain is used as the basis for extracting, from the reference image, the style information related to the appearance style of the reference image.

The image generation system according to any one of claims 1 to 7, further comprising a mapping network unit, wherein
the mapping network unit generates the style code associated with the specific domain of the reference image using noise information extracted from a Gaussian distribution.

The image generation system according to claim 8, wherein
the mapping network unit extracts the style information related to the specific domain of the reference image from the extracted noise information.

The image generation system according to claim 9, wherein
the mapping network unit uses the extracted noise information to generate a different style code for each of a plurality of domains associated with the reference image, and
the specific domain of the reference image is any one of the plurality of domains.

The image generation system according to any one of claims 1 to 10, wherein
the source image includes at least one appearance feature portion that determines the appearance identity of the source image, and
the image generation unit reflects the appearance style of the reference image on the source image centering on the remaining portion excluding the appearance feature portion of the source image.

The image generation system according to claim 11, wherein,
when the source image and the reference image correspond to a person,
the appearance feature portion of the source image corresponds to at least one of the person's eyes, nose, and mouth, and
the appearance style of the reference image is associated with at least one of the person's hairstyle, beard, age, skin color, and makeup.

The image generation system according to any one of the above, further comprising an identification unit, wherein
the identification unit determines, based on the reference image, whether the composite image is a fake image generated by the image generation unit for the specific domain of the reference image, and,
when the composite image is identified as a fake image as a result of the identification, the image generation unit is trained so that the difference between the reference image and the composite image identified as the fake image is minimized.

The image generation system according to any one of claims 1 to 13, further comprising a learning unit, wherein
the learning unit uses the style encoder to extract, from the composite image, the style code associated with the specific domain of the reference image, and
compares the style code of the composite image with the style code of the reference image.

The image generation system according to claim 14, wherein,
when the comparison shows that the style code of the composite image and the style code of the reference image differ from each other, the learning unit trains the image generation unit so that the difference value between the style code of the composite image and the style code of the reference image is minimized.

An image generation method comprising:
a step of receiving a source image to be converted;
a step of receiving a style code related to the appearance style of a reference image; and
a step of generating, using the style code, a composite image in which the appearance style of the reference image is reflected in the source image,
wherein the appearance style of the reference image is associated with a specific domain of the reference image.