JP2023143446A

JP2023143446A - Operation method of information processing device, information processing device, and program

Info

Publication number: JP2023143446A
Application number: JP2022050828A
Authority: JP
Inventors: 駿帯金; Shun Obikane; 理恵中村; Rie Nakamura; 勲石井; Isao Ishii
Original assignee: Kose Corp
Current assignee: Kose Corp
Priority date: 2022-03-25
Filing date: 2022-03-25
Publication date: 2023-10-06

Abstract

PROBLEM TO BE SOLVED: To enable image conversion that expresses various makeup techniques without impairing the attributes of the original face image. SOLUTION: An operating method of an information processing device includes: a first step of converting a first face image into a second face image by applying makeup information to the first face image by a predetermined procedure; a second step of converting the first face image into a third face image, with the second face image as a reference image, by an image conversion model generated by machine learning of processing that converts a conversion-target face image by applying makeup information included in a reference face image; and a third step of adjusting the image conversion model by reducing the loss of the first face image in the third face image. SELECTED DRAWING: Figure 3

Description

The present disclosure relates to an operating method of an information processing device, an information processing device, and a program.

There is a known technique for converting the style of a target image using an image conversion model obtained by machine learning of processing that applies the style of a reference image to another image (for example, Patent Document 1). This technique is applied in the beauty field and elsewhere. For example, a makeup conversion method has been proposed in which an image conversion model that applies the makeup information of a made-up face image, that is, a reference image, to a face image without makeup is used to convert a user's face image into a made-up face image so that makeup can be tried on virtually.

Patent Document 1: Japanese Unexamined Patent Application Publication No. 2021-135822 (JP2021-135822)

Image conversion models that learn the diverse styles of reference images by machine learning with techniques such as a GAN (Generative Adversarial Network) suffer from unstable training, so the attributes of the target image can be impaired when its style is converted. In makeup conversion, the texture of the original face image may be lost. In addition, makeup techniques depend on the individual skills of makeup artists and change from day to day, so they vary widely. Generating by machine learning an image conversion model that can express every makeup would require endless training, which is impractical.

In view of the above, the following discloses an operating method of an information processing device and the like that enable image conversion that expresses various makeup techniques without impairing the attributes of the original face image.

To solve the above problem, an operating method of an information processing device according to the present disclosure includes: a first step of applying makeup information to a first face image by a predetermined procedure to convert it into a second face image; a second step of converting the first face image into a third face image, with the second face image as the reference image, by an image conversion model generated by machine learning of processing that converts a conversion-target face image by applying makeup information included in a reference face image; and a third step of adjusting the image conversion model by reducing the loss of the first face image in the third face image.

An information processing device according to the present disclosure includes: a storage unit that stores an image conversion model, the model being generated by machine learning of processing that converts a conversion-target face image by applying makeup information included in a reference face image, and being adjusted so as to reduce the loss of a first face image in a second face image when the first face image is converted into the second face image using, as the reference image, a face image obtained by applying makeup information to the first face image by a predetermined procedure; and a control unit that performs the predetermined procedure on an input face image and then converts it into an output face image by the image conversion model.

Further, a program according to the present disclosure is a program executed by an information processing device that can use an image conversion model, the model being generated by machine learning of processing that converts a conversion-target face image by applying makeup information included in a reference face image, and being adjusted so as to reduce the loss of a first face image in a second face image when the first face image is converted into the second face image using, as the reference image, a face image obtained by applying makeup information to the first face image by a predetermined procedure. The program includes: a first step of performing the predetermined procedure on an input face image; and a second step of converting the face image on which the first step has been performed into an output face image by the image conversion model.

The operating method of an information processing device and the like according to the present disclosure enable image conversion that expresses various makeup techniques without impairing the attributes of the original face image.

FIG. 1 is a diagram showing a configuration example of the information processing system.
FIG. 2 is a flowchart showing an example of an operation procedure of the server device.
FIG. 3 is a flowchart showing an example of an operation procedure of the server device.
FIG. 4 is a diagram illustrating the face images used for image conversion.
FIG. 5 is a sequence diagram showing an example of an operation procedure of the server device and the terminal device.

Embodiments of the present invention are described below.

[System configuration]
FIG. 1 is a diagram showing a configuration example of an embodiment of the present invention. The information processing system 1 includes a server device 10 and a terminal device 12 that are connected via a network 11 so that they can communicate with each other. In the information processing system 1, the server device 10 performs machine learning using various information sent from the terminal device 12. The terminal device 12 is, for example, one or more personal computers. The personal computers may include tablet terminals, smartphones, and the like. The server device 10 is, for example, one or more server computers. The server device 10 may be a single server computer, or a plurality of server computers that execute the operations of this embodiment in coordination and provide a cloud service. The network 11 is, for example, a LAN (Local Area Network), the Internet, an ad hoc network, a MAN (Metropolitan Area Network), a mobile communication network, another network, or any combination of these.

The server device 10 acquires, from the terminal device 12, face images obtained by capturing a person's face, performs machine learning using the face images, and generates an image conversion model 108. The terminal device 12 is, for example, a device owned by a user or a device installed in a physical store and used by sales staff. The image conversion model 108 is an image conversion model that applies the makeup information of a made-up face image, that is, a reference image, to a face image without makeup (an image including the entire face of a person viewed from the front). The server device 10 also adjusts the image conversion model 108, using an image obtained by applying makeup information by a predetermined procedure to a face image without makeup (hereinafter, the original face image), so that the attributes of the original face image are not impaired when the image conversion model 108 converts the original face image.

Specifically, the server device 10 executes a first step (hereinafter, the algorithm processing step) of applying makeup information to the original face image by a predetermined procedure to convert it into a primary face image. The server device 10 then executes a second step (hereinafter, the makeup conversion step) of converting the original face image into a secondary face image, with the primary face image as the reference face image, by the image conversion model 108 generated by machine learning of processing that converts a conversion-target face image by applying makeup information included in a reference face image. Finally, the server device 10 executes a third step (hereinafter, the adjustment step) of adjusting the image conversion model 108 by reducing the loss of the original face image in the secondary face image. Here, the server device 10 corresponds to the "information processing device."
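The three steps above can be sketched as a single pass over a batch of original face images. This is a minimal illustration only: the function name `adjustment_pass` and the four callables (`apply_makeup`, `convert`, `loss`, `update`) are hypothetical placeholders for the algorithm processing step, the image conversion model 108, the loss, and the parameter update; the disclosure does not prescribe this interface.

```python
from typing import Callable, Sequence, TypeVar

Image = TypeVar("Image")


def adjustment_pass(
    originals: Sequence[Image],
    apply_makeup: Callable[[Image], Image],        # rule-based algorithm processing step
    convert: Callable[[Image, Image], Image],      # model: (target, reference) -> output
    loss: Callable[[Image, Image, Image], float],  # loss(original, primary, secondary)
    update: Callable[[float], None],               # adjusts the model to reduce the loss
) -> float:
    """Run the three steps once per image and return the mean loss."""
    if not originals:
        raise ValueError("at least one original face image is required")
    total = 0.0
    for original in originals:
        primary = apply_makeup(original)            # step 1: original -> primary
        secondary = convert(original, primary)      # step 2: original -> secondary
        value = loss(original, primary, secondary)  # step 3: measure the loss
        update(value)                               # step 3: adjust the model
        total += value
    return total / len(originals)
```

Keeping the rule-based step and the learned model as separate callables mirrors the separation described above: only `convert` has parameters to adjust.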

According to the present embodiment, the server device 10 executes the algorithm processing step according to a procedure for reproducing a makeup artist's technique, and can thereby imitate the artist's technique without machine learning. Separating the procedure for applying makeup information into the algorithm processing step also reduces the amount of information that must be learned, which improves the stability of the machine learning. Furthermore, the server device 10 has the image conversion model 108 convert the original face image into a secondary face image and adjusts the image conversion model 108 by reducing the loss of the original face image in the secondary face image. This suppresses impairment of the attributes of the original face image when it is converted into the secondary face image through the algorithm processing step and the makeup conversion step. In other words, image conversion becomes possible that reproduces various makeup techniques without impairing the attributes of the original face image, in particular textures such as glossy, matte, sheer, and foggy.

Next, the configurations of the server device 10 and the terminal device 12 are described.

The server device 10 includes a communication unit 101, a storage unit 102, a control unit 103, an input unit 105, and an output unit 106. When the server device 10 consists of two or more server computers, these components are distributed among the server computers as appropriate.

The communication unit 101 includes one or more communication interfaces. A communication interface is, for example, a LAN interface. The communication unit 101 receives information used for the operation of the server device 10 and transmits information obtained by the operation of the server device 10. The server device 10 is connected to the network 11 by the communication unit 101 and communicates with the terminal device 12 via the network 11.

The storage unit 102 includes, for example, one or more semiconductor memories, one or more magnetic memories, one or more optical memories, or a combination of at least two of these, functioning as a main storage device, an auxiliary storage device, or a cache memory. The semiconductor memory is, for example, RAM (Random Access Memory) or ROM (Read Only Memory). The RAM is, for example, SRAM (Static RAM) or DRAM (Dynamic RAM). The ROM is, for example, EEPROM (Electrically Erasable Programmable ROM). The storage unit 102 stores information used for the operation of the control unit 103 and information obtained by the operation of the control unit 103. The storage unit 102 also stores the image conversion model 108 that the control unit 103 generates based on information sent from the terminal device 12.

The control unit 103 includes one or more processors, one or more dedicated circuits, or a combination of these. A processor is, for example, a general-purpose processor such as a CPU (Central Processing Unit), or a dedicated processor specialized for specific processing such as a GPU (Graphics Processing Unit). A dedicated circuit is, for example, an FPGA (Field-Programmable Gate Array) or an ASIC (Application Specific Integrated Circuit). The control unit 103 executes information processing related to the operation of the server device 10 while controlling each part of the server device 10.

The functions of the server device 10 are realized by a processor included in the control unit 103 executing a control program. The control program is a program for causing the processor to function as the control unit 103. Some or all of the functions of the server device 10 may instead be realized by a dedicated circuit included in the control unit 103. The control program may be stored in a non-transitory recording or storage medium readable by the control unit 103, and the control unit 103 may read it from that medium.

The input unit 105 includes one or more input interfaces. An input interface is, for example, a physical key, a capacitive key, a pointing device, a touch screen provided integrally with a display, or a microphone that accepts voice input. The input unit 105 accepts operations for inputting information used for the operation of the server device 10 and sends the input information to the control unit 103.

The output unit 106 includes one or more output interfaces. An output interface is, for example, a display or a speaker. The display is, for example, an LCD (Liquid Crystal Display) or an organic EL (Electro-Luminescence) display. The output unit 106 outputs information obtained by the operation of the server device 10.

The terminal device 12 includes a communication unit 121, a storage unit 122, a control unit 123, an input unit 125, and an output unit 126.

The communication unit 121 includes a communication module compatible with a wired or wireless LAN standard, a module compatible with a mobile communication standard such as LTE, 4G, or 5G, and the like. The terminal device 12 is connected to the network 11 by the communication unit 121 via a nearby router or a mobile communication base station, and communicates with the server device 10 and other devices via the network 11.

The storage unit 122 includes one or more semiconductor memories, one or more magnetic memories, one or more optical memories, or a combination of at least two of these. The semiconductor memory is, for example, RAM or ROM. The RAM is, for example, SRAM or DRAM. The ROM is, for example, EEPROM. The storage unit 122 functions as, for example, a main storage device, an auxiliary storage device, or a cache memory. The storage unit 122 stores information used for the operation of the control unit 123 and information obtained by the operation of the control unit 123.

The control unit 123 includes, for example, one or more general-purpose processors such as a CPU or an MPU (Micro Processing Unit), or one or more dedicated processors specialized for specific processing such as a GPU. Alternatively, the control unit 123 may include one or more dedicated circuits such as an FPGA or an ASIC. The control unit 123 controls the overall operation of the terminal device 12 by operating according to a control and processing program or according to an operation procedure implemented as a circuit. The control unit 123 exchanges various information with the server device 10 and other devices via the communication unit 121 and executes the operations of the present embodiment.

The functions of the terminal device 12 are realized by a processor included in the control unit 123 executing a control program. The control program is a program for causing the processor to function as the control unit 123. Some or all of the functions of the terminal device 12 may instead be realized by a dedicated circuit included in the control unit 123. The control program may be stored in a non-transitory recording or storage medium readable by the control unit 123, and the control unit 123 may read it from that medium.

The input unit 125 includes one or more input interfaces. The input interfaces include, for example, physical keys, capacitive keys, pointing devices, and a touch screen provided integrally with a display. The input interfaces also include a microphone that accepts voice input and a camera that captures images. They may further include a scanner or camera for scanning image codes and an IC card reader. The input unit 125 accepts operations for inputting information used for the operation of the control unit 123 and sends the input information to the control unit 123. The input unit 125 also sends images captured by the camera to the control unit 123.

The output unit 126 includes one or more output interfaces. The output interfaces include, for example, a display and a speaker. The display is, for example, an LCD or an organic EL display. The output unit 126 outputs information obtained by the operation of the control unit 123.

[Generation of image conversion model]
FIG. 2 is a flowchart for explaining an example of the operation of the server device 10 in generating the image conversion model 108. Each step is executed by the control unit 103.

In step S20, the control unit 103 acquires the face images needed for the machine learning that generates the image conversion model. The face images include original face images without makeup, which are the conversion targets, and reference face images with makeup. An original face image is generated by capturing the face of a person without makeup. A reference face image is generated by capturing the face of a person with makeup. The faces are captured by, for example, the terminal device 12. For example, the control unit 103 receives a plurality of original face images and a plurality of reference face images sent from the terminal device 12 via the communication unit 101 and stores them in the storage unit 102. The control unit 103 may also acquire the original face images and the reference face images from open data.

In step S22, the control unit 103 executes machine learning. The control unit 103 executes, for example, deep learning using a GAN. The control unit 103 has a module corresponding to a generator, which applies the makeup information of a reference face image to an original face image, and a module corresponding to a discriminator, which distinguishes the face image produced by the generator from the original face image. The control unit 103 generates the image conversion model 108 by training the generator and the discriminator adversarially, and stores the generated image conversion model 108 in the storage unit 102.

[Adjustment of image conversion model]
FIG. 3 is a flowchart for explaining an example of the operation of the server device 10 in adjusting the image conversion model 108. Each step is executed by the control unit 103. FIG. 4 is a diagram for explaining the face images used to adjust the image conversion model 108. The procedure of FIG. 3 is described with reference to FIG. 4.

In step S30, the control unit 103 acquires an original face image Isrc for executing the algorithm processing step. The original face image Isrc is a face image without makeup, generated by, for example, the terminal device 12 capturing the face of a person without makeup. For example, the control unit 103 receives the original face image Isrc sent from the terminal device 12 via the communication unit 101 and stores it in the storage unit 102. The control unit 103 may also acquire a plurality of original face images Isrc from open data.

In step S32, the control unit 103 executes the algorithm processing step. The control unit 103 applies predetermined makeup information MU to the original face image Isrc and converts the original face image Isrc into a primary face image Isyn. The makeup information MU includes one or more of the hue, brightness, and saturation to be applied to regions of the original face image Isrc including the eyes, the bridge of the nose, the cheeks, and the lips (hereinafter, makeup regions). The makeup information MU and the procedure for applying it constitute an image processing procedure that can be set arbitrarily in advance. The hue, brightness, and saturation applied to the makeup regions may be specified quantitatively as desired, or may be extracted from the colors already present in a reference face image and then applied. The user can freely select the reference face image, for example to suit the look they have in mind. The reference images include, for example, face images with various types of makeup such as elegant, cool, and trendy, as well as face images of real people. The makeup information MU may be entered by an operator at the terminal device 12 and sent from the terminal device 12 to the server device 10. The control unit 103 executes image processing that applies eyeliner, nose shadow, cheek color, lip color, or the like, in the colors set in the makeup information MU, to the eyes, bridge of the nose, cheeks, or lips, respectively, of the original face image Isrc.
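As a rough illustration of such a rule-based procedure, the sketch below tints the pixels of one makeup region toward a target hue and saturation while keeping each pixel's own lightness, then blends the result with the original. The function name `apply_makeup_color`, the per-pixel `mask`, and the `strength` parameter are assumptions made for this example; the disclosure leaves the concrete procedure open. HLS lightness from Python's `colorsys` module stands in for brightness here.

```python
import colorsys


def apply_makeup_color(image, mask, hue, saturation, strength=0.6):
    """Tint masked pixels toward (hue, saturation), keeping each pixel's lightness.

    image: rows of (r, g, b) tuples with components in [0, 1]
    mask:  rows of weights in [0, 1]; 0 leaves the pixel untouched
    """
    out = []
    for img_row, mask_row in zip(image, mask):
        row = []
        for (r, g, b), weight in zip(img_row, mask_row):
            # Keep the pixel's lightness so shading in the region survives.
            _, lightness, _ = colorsys.rgb_to_hls(r, g, b)
            tr, tg, tb = colorsys.hls_to_rgb(hue, lightness, saturation)
            alpha = strength * weight
            row.append((
                (1 - alpha) * r + alpha * tr,
                (1 - alpha) * g + alpha * tg,
                (1 - alpha) * b + alpha * tb,
            ))
        out.append(row)
    return out
```

For example, a lip mask with `hue=0.0` (red) and `saturation=1.0` reddens only the masked pixels; pixels whose mask weight is 0 are returned unchanged.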

In step S34, the control unit 103 executes the makeup conversion step. The control unit 103 inputs the original face image Isrc to the image conversion model 108 as the conversion-target face image and the primary face image Isyn as the reference face image. Using an encoder, the image conversion model 108 extracts feature information such as face shape, three-dimensionality, and surface condition from the original face image Isrc and makeup information from the primary face image Isyn. It then applies the makeup information of the primary face image Isyn to the original face image Isrc and, using a decoder, converts it into a secondary face image Ig.

In step S36, the control unit 103 executes the adjustment step. Taking the secondary face image Ig as an argument of a loss function L, the control unit 103 adjusts the parameters of the image conversion model 108 so as to minimize the value of the loss function L. The loss function L includes, for example, one or more of Adversarial loss, Makeup loss, Perceptual loss, MGE (Mean Gradient Error) loss, and Color loss.
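One common way to combine several loss terms is a weighted sum, shown below as a sketch. The weights λ and the symbol θ for the parameters of the image conversion model 108 are assumptions introduced for illustration; the disclosure states only that L includes one or more of the listed terms, and setting a weight to zero drops the corresponding term.

```latex
L(\theta) = \lambda_{\mathrm{adv}} L_{\mathrm{adv}}
          + \lambda_{\mathrm{mk}}  L_{\mathrm{makeup}}
          + \lambda_{\mathrm{per}} L_{\mathrm{per}}
          + \lambda_{\mathrm{mge}} L_{\mathrm{MGE}}
          + \lambda_{\mathrm{col}} L_{\mathrm{color}},
\qquad
\theta^{*} = \arg\min_{\theta} L(\theta)
```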

Adversarial loss is a loss function for training the generator so that it fools the discriminator in the GAN. The control unit 103 has the discriminator judge whether the secondary face image Ig is a result of the makeup conversion step or the original face image Isrc, and uses the result to train the generator to fool the discriminator. In this way, the parameters of the image conversion model 108 are adjusted so that the loss of the original face image Isrc in the secondary face image Ig becomes small. The discriminator may include a global discriminator and a local discriminator. The global discriminator makes this judgment on the entire face image. The local discriminator makes this judgment on the makeup regions. Using the global and local discriminators together can improve the training accuracy of the generator.
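The disclosure does not give a formula for the adversarial loss. As one familiar possibility, the sketch below uses the standard binary cross-entropy GAN losses, with the generator term in its non-saturating form. The score arguments are assumed to be discriminator outputs in (0, 1), read as the probability that an image is not generator output; all function names are hypothetical, and summing the global and averaged local terms is an illustrative choice.

```python
import math

EPS = 1e-12  # guards against log(0)


def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy: non-generated images should score 1, generated 0."""
    return -(math.log(d_real + EPS) + math.log(1.0 - d_fake + EPS))


def generator_loss(d_fake):
    """Non-saturating generator loss: small when the discriminator is fooled."""
    return -math.log(d_fake + EPS)


def combined_generator_loss(d_fake_global, d_fake_local_scores):
    """Global term plus the mean of the per-region (local) terms."""
    local = sum(generator_loss(s) for s in d_fake_local_scores)
    return generator_loss(d_fake_global) + local / len(d_fake_local_scores)
```

The generator loss falls as the discriminator's score for the generated image rises, which is the sense in which the generator is trained to fool the discriminator.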

Makeup loss is a loss function for training the generator to fool the discriminator with respect to color distribution. Based on the primary face image Isyn, the control unit 103 uses histogram matching to generate a pseudo face image having the same color distribution as the primary face image Isyn. Histogram matching may be performed on the color distribution of the entire face image or on the color distribution of the makeup area. The control unit 103 then causes the discriminator to identify whether the pseudo face image or the secondary face image Ig is the result of histogram matching or the makeup conversion process, or is the original face image Isrc, and uses the result to train the generator so as to fool the discriminator. In this way, the parameters of the image conversion model 108 are adjusted so that the loss of the original face image Isrc in the secondary face image Ig is reduced.
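
Histogram matching itself is a standard technique, and a minimal single-channel sketch is given below. It assumes 8-bit integer intensities held in flat lists; the function names are hypothetical, and a color image would apply the same remapping per channel, either to the whole face or to the makeup area only.

```python
from bisect import bisect_left
from typing import List


def cdf(values: List[int], levels: int = 256) -> List[float]:
    """Cumulative distribution of integer intensities in [0, levels)."""
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    total, running, out = len(values), 0, []
    for count in hist:
        running += count
        out.append(running / total)
    return out


def match_histogram(source: List[int], reference: List[int],
                    levels: int = 256) -> List[int]:
    """Remap `source` so that its intensity distribution follows `reference`."""
    src_cdf, ref_cdf = cdf(source, levels), cdf(reference, levels)
    # For each source level, take the first reference level whose CDF
    # reaches the source CDF at that level.
    lut = [min(bisect_left(ref_cdf, src_cdf[level]), levels - 1)
           for level in range(levels)]
    return [lut[v] for v in source]
```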

Perceptual loss is a loss function for training the generator to fool the discriminator with respect to the contours of the face image. The control unit 103 acquires a partially converted face image from an intermediate layer of the generator while the primary face image Isyn is being converted into the secondary face image Ig, and generates an edge image by extracting the edges of each part, such as the eyes, nose, and mouth, in that image. The control unit 103 then causes the discriminator to identify whether the edges in the edge image result from the makeup conversion process or are the edges of the original face image Isrc, and uses the result to train the generator so as to fool the discriminator. In deep learning, edge information of the image being converted is extracted as features in the intermediate layers. By training with edge images from the intermediate layer, the parameters of the image conversion model 108 are adjusted so that the loss related to the edges of the original face image Isrc in the secondary face image Ig is reduced. That is, the image conversion model 108 is adjusted so that the contours of the face in the secondary face image Ig match the contours of the face in the original face image Isrc.

MGE loss is a loss function relating to the contours of the face image. The control unit 103 applies a differential filter to each of the original face image Isrc and the secondary face image Ig to increase the edge resolution. The control unit 103 then adjusts the parameters of the image conversion model 108 so that the edges in the high-resolution secondary face image Ig match the edges in the high-resolution original face image Isrc. In this way, the image conversion model 108 is adjusted so that the loss related to the edges of the original face image Isrc in the secondary face image Ig is reduced.
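
A mean gradient error of this kind can be sketched as below. The choice of a Sobel operator as the differential filter, the use of gradient magnitude, and the squared-difference average are assumptions; the disclosure names only "a differential filter". Images are grayscale nested lists of at least 3 x 3 pixels, and all names are hypothetical.

```python
from typing import List

Image = List[List[float]]

# Sobel kernels, assumed here as the differential filter.
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def gradient_magnitude(img: Image) -> Image:
    """Sobel gradient magnitude over interior pixels (no padding)."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(1, h - 1):
        row = []
        for x in range(1, w - 1):
            gx = sum(SOBEL_X[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(SOBEL_Y[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            row.append((gx * gx + gy * gy) ** 0.5)
        out.append(row)
    return out


def mge_loss(src: Image, gen: Image) -> float:
    """Mean squared difference between the edge maps of Isrc and Ig."""
    g_src, g_gen = gradient_magnitude(src), gradient_magnitude(gen)
    diffs = [(a - b) ** 2
             for ra, rb in zip(g_src, g_gen) for a, b in zip(ra, rb)]
    return sum(diffs) / len(diffs)
```

The value is zero when the two edge maps coincide and grows as edges appear in one image but not the other.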

Color loss is a loss function that reinforces Makeup loss with respect to color distribution. The control unit 103 derives the difference between the mean color value defined by the makeup information MU applied to the original face image Isrc and the mean color value in the makeup area of the primary face image Isyn, and likewise the difference between the corresponding variance values. The control unit 103 then adjusts the parameters of the image conversion model 108 so as to reduce each difference. In this way, the parameters of the image conversion model 108 are adjusted so that the loss of the original face image Isrc in the secondary face image Ig is reduced.
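
The mean and variance differences can be sketched for a single color channel as follows. Summing the two absolute differences into one value, and the name `color_loss`, are assumptions; the text only says that each difference is reduced.

```python
from statistics import fmean, pvariance
from typing import Sequence


def color_loss(target_mean: float, target_var: float,
               region_pixels: Sequence[float]) -> float:
    """Absolute differences in mean and variance for one color channel.

    `target_mean` and `target_var` stand for the values defined by the
    makeup information MU; `region_pixels` are the channel values inside
    the makeup area of the face image being compared.
    """
    mean_diff = abs(fmean(region_pixels) - target_mean)
    var_diff = abs(pvariance(region_pixels) - target_var)
    return mean_diff + var_diff
```

A three-channel version would call this once per channel and add the results.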

The image conversion model 108 is adjusted through the procedure described above.

The adjusted image conversion model 108 is used to convert any original face image. Specifically, the control unit 103 generates a primary face image by performing algorithm processing on an arbitrary original face image, and converts the primary face image into a secondary face image using the image conversion model 108. The results of verifying the operation of the image conversion model 108 are shown below.

[Verification 1]
As the algorithm processing by the server device 10, the lip color included in a reference face image was applied to the original face image by histogram matching. As the original face images, 150 arbitrary face images from open data were used. Each original face image was converted into a primary face image by the algorithm processing, and the primary face image was converted into a secondary face image by the image conversion model 108. Here, a U-net architecture was adopted for the image conversion model 108. In addition, makeup information was applied to each original face image using related technologies. BeautyGAN, PSGAN (Pose and Expression Robust Spatial-Aware GAN), and CPM (Color-Pattern Makeup Transfer) were adopted as the related technologies. For the secondary face images and the face images to which makeup information was applied by each related technology, scores indicating impressions on the following four items were collected from 12 subjects (women aged 18 to 23).
<Item 1> Is the lip color of the reference face image reflected?
<Item 2> Is the skin texture of the original face image maintained?
<Item 3> Is the lip texture of the original face image maintained?
<Item 4> Overall evaluation of <1> to <3>

Table 1 below shows the aggregated scores for each item for the secondary face images of this embodiment and the face images produced by each related technology. A higher score indicates a better impression. As shown in Table 1, for <Item 2> to <Item 4>, the secondary face images of this embodiment gave a better impression than the face images produced by the related technologies.

[Verification 2]
As the algorithm processing by the server device 10, the lip color included in a reference face image was applied to the original face image by histogram matching. An arbitrary face image from open data was used as the original face image. Four lip colors were used: crimson (紅色), azuki red (小豆色), true vermilion (真朱), and red-orange (丹色). During the algorithm processing, the texture information of the original face image was deleted so that the texture was lost. After the original face image was converted into a primary face image by the algorithm processing, the server device 10 converted the primary face image into secondary face images using four image conversion models 108, each adjusted with a different combination of loss functions. Here, a U-net architecture was adopted for the image conversion model 108. The four combination patterns of loss functions used to adjust the image conversion model 108 were as follows.
<Pattern 1> Adversarial loss, Makeup loss, Perceptual loss, and Color loss
<Pattern 2> Adversarial loss, Makeup loss, MGE loss, and Color loss
<Pattern 3> Adversarial loss, Makeup loss, Perceptual loss, and MGE loss
<Pattern 4> Adversarial loss, Makeup loss, Perceptual loss, MGE loss, and Color loss

The arbitrary original face image was converted into a total of 16 secondary face images: four patterns for each of the four lip colors. For each lip color, the secondary face image produced by each pattern was then evaluated qualitatively. The qualitative evaluation for each pattern was as follows.
<Pattern 1> The texture of the original face image is not reproduced.
<Pattern 2> The contours stand out too much.
<Pattern 3> Training becomes unstable.
<Pattern 4> All of the above are resolved.
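
To make the comparison easier to read, the short sketch below lists which loss term each pattern leaves out relative to Pattern 4. The pattern contents are taken from the list above; pairing each reported observation with the omitted term is only a reading of the results, not a statement made in the disclosure, and the helper name `omitted` is hypothetical.

```python
# All five loss terms; Pattern 4 uses every one of them.
ALL_LOSSES = {"Adversarial", "Makeup", "Perceptual", "MGE", "Color"}

# Loss combinations used in Verification 2.
PATTERNS = {
    1: {"Adversarial", "Makeup", "Perceptual", "Color"},
    2: {"Adversarial", "Makeup", "MGE", "Color"},
    3: {"Adversarial", "Makeup", "Perceptual", "MGE"},
    4: set(ALL_LOSSES),
}


def omitted(pattern: int) -> set:
    """Return the loss terms that the given pattern does not use."""
    return ALL_LOSSES - PATTERNS[pattern]


for number in sorted(PATTERNS):
    print(number, sorted(omitted(number)))
```

Read this way, Pattern 1 omits MGE loss, Pattern 2 omits Perceptual loss, and Pattern 3 omits Color loss.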

[Verification 3]
As the algorithm processing by the server device 10, the lip color included in a reference face image was applied to the original face image by histogram matching. An arbitrary face image from open data was used as the original face image. After the algorithm processing, the lip colors were clustered in color space to identify glossy areas, and the gloss was suppressed by filling the glossy areas with the color of the non-glossy areas. The original face image was converted into a primary face image by the algorithm processing, and the primary face image was converted into a secondary face image by the image conversion model 108. Here, a U-net architecture was adopted for the image conversion model 108. As a result, a qualitative evaluation was obtained that, in the secondary face image, the three-dimensional appearance of the original face image was reproduced in the areas where gloss had been suppressed.
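
The gloss suppression step can be illustrated with the following sketch. The clustering method is not specified in the text; this example assumes two clusters found by a simple 1-D k-means on pixel brightness, treats the brighter cluster as the glossy area, and fills it with the mean color of the other cluster. The function names and the RGB tuple representation are hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[float, float, float]


def brightness(p: Pixel) -> float:
    """Simple brightness proxy: mean of the three channels."""
    return sum(p) / 3.0


def suppress_gloss(lip_pixels: List[Pixel], iterations: int = 10) -> List[Pixel]:
    """Fill the brighter (assumed glossy) cluster with the matte mean color."""
    values = [brightness(p) for p in lip_pixels]
    lo, hi = min(values), max(values)
    if lo == hi:
        return list(lip_pixels)  # uniform region: nothing to suppress
    glossy = []
    for _ in range(max(1, iterations)):
        # Assign each pixel to the nearer of the two brightness centers.
        glossy = [abs(v - hi) < abs(v - lo) for v in values]
        dark = [v for v, g in zip(values, glossy) if not g]
        shiny = [v for v, g in zip(values, glossy) if g]
        lo, hi = sum(dark) / len(dark), sum(shiny) / len(shiny)
    matte = [p for p, g in zip(lip_pixels, glossy) if not g]
    # Mean color of the non-glossy pixels, channel by channel.
    fill = tuple(sum(channel) / len(matte) for channel in zip(*matte))
    return [fill if g else p for p, g in zip(lip_pixels, glossy)]
```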

[Example]
FIG. 5 is a sequence diagram for explaining an operation example of the information processing system 1 in the example. The procedure in FIG. 5 relates to the cooperative operation of the terminal device 12 and the server device 10, which has the image conversion model 108 adjusted by the procedure of this embodiment. The terminal device 12 is used, for example, by a user who tries out makeup using a captured image of his or her own face.

In step S50, the terminal device 12 captures an image of the user. In response to the user's operation input to the input unit 125, the control unit 123 of the terminal device 12 captures the image using a camera included in the input unit 125. The terminal device 12 thereby acquires the original face image.

In step S51, the terminal device 12 receives an input for selecting the algorithm processing. The control unit 123 of the terminal device 12 executes, for example, an application program that provides virtual makeup. The control unit 123 displays, for example, a selection menu on a display included in the output unit 126. Then, in accordance with the user's operation input to the input unit 125, the control unit 123 selects the type of algorithm processing and the color to be applied to the face image by the algorithm processing. The selectable algorithm processing is the application of eyeliner, nose shadow, cheek color, lip color, or the like. The control unit 123 may display reference face images including makeup information as samples, and the algorithm processing may be selected by the user selecting a sample.

In step S52, the terminal device 12 sends the original face image and an image conversion request to the server device 10. The image conversion request includes information identifying the selected algorithm processing. The control unit 123 sends the original face image and the image conversion request via the communication unit 121. In the server device 10, the control unit 103 receives the information sent from the terminal device 12 via the communication unit 101.

In step S53, the server device 10 executes the algorithm processing on the original face image. The control unit 103 executes the specified algorithm processing on the original face image. The original face image is thereby converted into a primary face image.

In step S55, the control unit 103 of the server device 10 converts the primary face image into a secondary face image using the image conversion model 108.

In step S56, the server device 10 sends the secondary face image for output to the terminal device 12. The control unit 103 sends the secondary face image via the communication unit 101. In the terminal device 12, the control unit 123 receives the information sent from the server device 10 via the communication unit 121.

In step S57, the terminal device 12 displays the secondary face image. The control unit 123 displays the secondary face image on, for example, a display included in the output unit 126.

According to the procedure described above, the adjusted image conversion model 108 makes it possible to output a natural face image that has makeup applied and still reproduces the texture of the original face image. The user can check an image of his or her own face with the specified makeup applied.
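
The server-side part of this sequence (steps S53 and S55) can be sketched as a small request handler. Everything named here is hypothetical: the `ImageConversionRequest` fields, the dictionary of algorithm processing functions, and the callable standing in for the image conversion model 108 are assumptions made for illustration, and images are represented as plain nested lists.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Image = List[List[float]]                    # placeholder image type
Algorithm = Callable[[Image, str], Image]    # algorithm processing (S53)
Model = Callable[[Image], Image]             # image conversion model (S55)


@dataclass
class ImageConversionRequest:
    """What the terminal device sends in step S52 (assumed fields)."""
    original_image: Image
    algorithm: str   # e.g. "lip_color"
    color: str       # color selected by the user in step S51


def handle_request(request: ImageConversionRequest,
                   algorithms: Dict[str, Algorithm],
                   model: Model) -> Image:
    """Run the selected algorithm processing, then the adjusted model."""
    if request.algorithm not in algorithms:
        raise ValueError(f"unsupported algorithm processing: {request.algorithm}")
    # S53: original face image -> primary face image.
    primary = algorithms[request.algorithm](request.original_image, request.color)
    # S55: primary face image -> secondary face image.
    return model(primary)
```

The returned secondary face image corresponds to what the server device sends back in step S56.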

As described above, according to the present embodiment, image conversion that expresses various makeup techniques without impairing the attributes of the original face image becomes possible.

In the above description, the server device 10 corresponds to the "information processing device." However, the server device 10 and the terminal device 12 may operate in cooperation to constitute the "information processing device," or the terminal device 12 may correspond to the "information processing device."

In the embodiment described above, the processing and control program that defines the operation of the terminal device 12 may be stored in the storage unit 102 of the server device 10 or in the storage unit of another server device and downloaded to the terminal device 12 via the network 11, or may be stored in a computer-readable non-transitory recording or storage medium and read from the medium by the terminal device 12.

Although the embodiments have been described above based on the drawings and examples, it should be noted that those skilled in the art can easily make various changes and modifications based on the present disclosure. It should therefore be noted that these changes and modifications are included within the scope of the present disclosure. For example, the functions included in each means, each step, and the like can be rearranged so as not to be logically inconsistent, and a plurality of means, steps, and the like can be combined into one or divided.

10: Server device
11: Network
12: Terminal device
101, 121: Communication unit
102, 122: Storage unit
103, 123: Control unit
105, 125: Input unit
106, 126: Output unit
108: Image conversion model
Isrc: Original face image
Isyn: Primary face image
Ig: Secondary face image
L: Loss function

Claims

A method for operating an information processing device, the method comprising:
a first step of converting a first face image into a second face image by adding makeup information to the first face image according to a predetermined procedure;
a second step of converting the first face image into a third face image, with the second face image as a reference image, using an image conversion model generated by machine learning processing for converting a face image to be converted by adding makeup information included in a reference face image; and
a third step of adjusting the image conversion model by reducing loss of the first face image in the third face image.

The operating method according to claim 1, wherein
the predetermined procedure is a procedure for adding makeup information of one or more of hue, brightness, and saturation to a part of the first face image including the eyes, bridge of the nose, cheeks, and lips.

The operating method according to claim 2, wherein
the predetermined procedure does not include machine learning.

The operating method according to claim 2, wherein
the loss is a loss of one or more of hue, brightness, and saturation of the part in the first face image, or a loss of an edge of the part.

The operating method according to claim 4, wherein
the loss is represented by any one or more of Adversarial loss, Makeup loss, and Perceptual loss.

The operating method according to claim 5, wherein
in the third step, the loss is further adjusted using MGE loss or Color loss.

The operating method according to claim 1, wherein
the image conversion model includes an encoder that extracts makeup information from the reference face image, and a decoder that converts the face image to be converted.

An information processing device comprising:
a storage unit that stores an image conversion model that is generated by machine learning processing for converting a face image to be converted by adding makeup information included in a reference face image, and that is adjusted so as to reduce loss of a first face image in a second face image when the first face image is converted into the second face image using, as a reference image, a face image obtained by adding makeup information to the first face image according to a predetermined procedure; and
a control unit that converts an input face image into an output face image using the image conversion model after executing the predetermined procedure.

A program executed by an information processing device, wherein
the information processing device is capable of using an image conversion model that is generated by machine learning processing for converting a face image to be converted by adding makeup information included in a reference face image, and that is adjusted so as to reduce loss of a first face image in a second face image when the first face image is converted into the second face image using, as a reference image, a face image obtained by adding makeup information to the first face image according to a predetermined procedure, the program comprising:
a first step of performing the predetermined procedure on an input face image; and
a second step of converting the face image on which the first step has been performed into a face image for output using the image conversion model.