JP2019153223A

JP2019153223A - Device, method, and program for generating images

Info

Publication number: JP2019153223A
Application number: JP2018039799A
Authority: JP
Inventors: 大地木村; Daichi Kimura; 紗希上田; Saki Ueda; 智海大川内; Tomomi Okawachi
Original assignee: NTT Communications Corp
Current assignee: NTT Communications Corp
Priority date: 2018-03-06
Filing date: 2018-03-06
Publication date: 2019-09-12
Anticipated expiration: 2038-03-06
Also published as: JP6865705B2

Abstract

PROBLEM TO BE SOLVED: To easily provide dish images that leave room for creativity and thereby help the user prepare a wide variety of dishes. SOLUTION: An image generation device 10 acquires information on ingredients entered by a user and randomly generated random number data. The image generation device 10 then generates a dish image using a trained model for generating dish images, with the acquired ingredient information and random number data as input. SELECTED DRAWING: Figure 1

Description

The present invention relates to an image generation device, an image generation method, and an image generation program.

In home cooking, people often want to use the ingredients they already have at hand while keeping reasonable variety in their dishes to improve quality of life, or want to try creating a new dish through their own ingenuity. For such situations, techniques exist that suggest to the user what kinds of dishes can be made with the ingredients at hand.

Examples of such techniques include search over a database that stores dish images together with the corresponding ingredients (for example, Cookpad (registered trademark)) and general-purpose web image search (for example, Google (registered trademark) image search). With such search techniques, when a search query such as a dish name or an ingredient name is entered, search results are output according to popularity and relevance.

There are also techniques for automatically generating recipes based on classical knowledge processing and natural language processing (for example, Chef Watson (registered trademark)). Such techniques modify an existing recipe to output a new recipe, and also output an existing dish image corresponding to the existing recipe.

Japanese Laid-open Patent Publication No. 2004-192050 (JP 2004-192050 A)

However, the conventional techniques cannot easily provide dish images that leave room for creativity and thereby help the user prepare a wide variety of dishes. For example, with the conventional search techniques described above, the search results are ordered by popularity and relevance, so similar dishes tend to dominate the top results, and obtaining appropriate and diverse results requires skill on the user's part in composing search queries. In addition, because these are search techniques, only dishes that already exist in the database or on the web can be presented to the user.

The conventional automatic recipe generation techniques described above modify an existing recipe, so they never generate a recipe that is fundamentally different from the source recipe. Because the recipe is output concretely and in detail, little room is left for the user's own creativity. Furthermore, the dish image output together with the modified recipe is an existing dish image, so the appearance of the dish made from the new recipe is not presented and is difficult to guess until the dish is actually cooked.

To solve the problems described above and achieve the object, an image generation device according to the present invention includes: an acquisition unit that acquires information on ingredients entered by a user and randomly generated random number data; and a trained generation unit that generates a dish image using a trained model for generating dish images, with the ingredient information and the random number data acquired by the acquisition unit as input.

An image generation method according to the present invention is an image generation method executed by an image generation device, and includes: an acquisition step of acquiring information on ingredients entered by a user and randomly generated random number data; and a trained generation step of generating a dish image using a trained model for generating dish images, with the ingredient information and the random number data acquired in the acquisition step as input.

An image generation program according to the present invention causes a computer to execute: an acquisition step of acquiring information on ingredients entered by a user and randomly generated random number data; and a trained generation step of generating a dish image using a trained model for generating dish images, with the ingredient information and the random number data acquired in the acquisition step as input.

According to the present invention, dish images that leave room for creativity and thereby help the user prepare a wide variety of dishes can be provided easily.

FIG. 1 is a block diagram illustrating a configuration example of the image generation device according to the first embodiment.
FIG. 2 is a diagram illustrating an example of data stored in the dish data storage unit.
FIG. 3 is a diagram explaining the learning process in the image generation device according to the first embodiment.
FIG. 4 is a diagram illustrating an example of the ingredient input screen displayed on the user terminal.
FIG. 5 is a diagram explaining the image generation process in the image generation device according to the first embodiment.
FIG. 6 is a diagram illustrating an example of the dish image output screen displayed on the user terminal.
FIG. 7 is a flowchart illustrating an example of the flow of the learning process in the image generation device according to the first embodiment.
FIG. 8 is a flowchart illustrating an example of the flow of the image generation process in the image generation device according to the first embodiment.
FIG. 9 is a diagram illustrating a computer that executes the image generation program.

Embodiments of the image generation device, image generation method, and image generation program according to the present application are described in detail below with reference to the drawings. The embodiments do not limit the image generation device, image generation method, or image generation program according to the present application.

[First Embodiment]
The following describes, in order, the configuration of the image generation device 10 according to the first embodiment and the flow of its processing, and finally the effects of the first embodiment.

[Configuration of the Image Generation Device]
FIG. 1 is a block diagram illustrating a configuration example of the image generation device according to the first embodiment. The configuration of the image generation device 10 is described with reference to FIG. 1. As shown in FIG. 1, the image generation device 10 is connected to a user terminal 20 via a network 30.

The user terminal 20 is an information processing device such as a desktop PC, a tablet PC, a notebook PC, a mobile phone, a smartphone, or a PDA (Personal Digital Assistant).

As shown in FIG. 1, the image generation device 10 includes a communication processing unit 11, a control unit 12, and a storage unit 13. The processing of each unit of the image generation device 10 is described below.

The communication processing unit 11 controls communication of various kinds of information. For example, the communication processing unit 11 receives ingredient names and a dish image generation request from the user terminal 20. The communication processing unit 11 also transmits the generated dish image to the user terminal 20.

The storage unit 13 stores the data and programs required for the various processes of the control unit 12 and, as the part most closely related to the present invention, includes a dish data storage unit 13a. The storage unit 13 is, for example, a semiconductor memory element such as RAM (Random Access Memory) or flash memory, or a storage device such as a hard disk or an optical disk. The data stored in the dish data storage unit 13a is stored in advance and can be updated as appropriate.

The dish data storage unit 13a stores dish images, which are images of real dishes, in association with information on the ingredients used in those dishes. For example, as illustrated in FIG. 2, the dish data storage unit 13a stores, in association with one another, a “dish image” that is an image of a real dish, “ingredient names” indicating the names of the ingredients used in the dish shown in the dish image, and an “ingredient vector” obtained by vectorizing the ingredient names. FIG. 2 is a diagram illustrating an example of data stored in the dish data storage unit.

In the example of FIG. 2, the dish data storage unit 13a stores the dish image “image A”, the ingredient names “egg, chicken, onion, ...”, and the ingredient vector “vector A” in association with one another. The information illustrated in FIG. 2 is merely an example and is not limiting. In FIG. 2, dish images and ingredient vectors are abbreviated as image A, vector A, and so on.
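
As a rough illustration of the association described above, one stored row could be modelled as follows. This is only a sketch: the class name, field names, helper function, and numeric values are hypothetical and are not taken from the publication.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DishRecord:
    """One row of the dish data storage unit 13a (all names are hypothetical)."""
    dish_image: str                 # placeholder for image data, e.g. "image A"
    ingredient_names: List[str]     # e.g. ["egg", "chicken", "onion"]
    ingredient_vector: List[float]  # vectorized ingredient names, e.g. "vector A"


# Illustrative contents only; the vector values are made up.
dish_data_storage = [
    DishRecord("image A", ["egg", "chicken", "onion"], [0.2, -0.1, 0.7]),
]


def find_by_ingredient(records, name):
    """Return every record whose ingredient list contains `name`."""
    return [record for record in records if name in record.ingredient_names]
```

Keeping the three fields in one record means that a sampled set of ingredient names always comes with its matching real image and vector.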

The control unit 12 has an internal memory for storing programs that define various processing procedures and the required data, and executes various processes using them. As the parts most closely related to the present invention, it includes a generation unit 12a, an identification unit 12b, a learning unit 12c, an acquisition unit 12d, and a trained generation unit 12e. The control unit 12 is an electronic circuit such as a CPU (Central Processing Unit), an MPU (Micro Processing Unit), or a GPU (Graphics Processing Unit), or an integrated circuit such as an ASIC (Application Specific Integrated Circuit) or an FPGA (Field Programmable Gate Array).

The processing executed by the functional units of the control unit 12 is broadly divided into a learning phase, in which machine learning of the model is performed, and an image generation phase, in which dish images are generated using the trained model. The generation unit 12a, the identification unit 12b, and the learning unit 12c perform the processing of the learning phase, and the acquisition unit 12d and the trained generation unit 12e perform the processing of the image generation phase. The image generation device 10 according to the first embodiment is described as a device that performs both the learning process of the learning phase and the image generation process of the image generation phase, but it is not limited to this and may perform only the image generation process. In that case, a trained model on which machine learning has already been performed is set in the image generation device 10 in advance.

In the learning phase, for example, a generative adversarial network (GAN), which is a type of neural network, is used: two neural networks, a generator and a discriminator, are combined to learn from a given data set. For example, when the target data are images, the generator is built to generate some random image, and the discriminator is built to identify whether an input image belongs to the original data set or was produced by the generator. The learning process of the learning phase and the image generation process of the image generation phase are described in detail later with reference to the drawings. Each functional unit is described below.

The generation unit 12a generates a dish image using a first model (hereinafter, the “generator”) that generates dish images, with the ingredient information stored in the dish data storage unit 13a and random numbers as input.

Specifically, the generation unit 12a obtains a set of ingredient names by random sampling from the dish data storage unit 13a and converts the obtained ingredient names into a vector. For example, the generation unit 12a converts the ingredient names into a vector φ by word embedding, and then reduces the dimensionality by representing φ as a latent variable c′ based on the normal distribution N(μ(φ), σ(φ)). The generation unit 12a then concatenates the vector c′ with a random number vector z and feeds the concatenated vector to the generator to generate a dish image.
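
The construction of the generator input can be sketched as below. The publication states only that φ is obtained by word embedding and that c′ is based on N(μ(φ), σ(φ)). The mean pooling of per-ingredient embeddings, the linear maps used for μ and σ, the sampling rule c′ = μ + σ·ε, and all dimensions and parameter values are assumptions made for illustration.

```python
import random


def embed(names, table):
    """Mean-pool per-ingredient embeddings into one vector phi (pooling is an assumption)."""
    dim = len(table[names[0]])
    phi = [0.0] * dim
    for name in names:
        for i, value in enumerate(table[name]):
            phi[i] += value / len(names)
    return phi


def sample_latent(phi, w_mu, w_sigma, rng):
    """Draw c' from N(mu(phi), sigma(phi)) as c' = mu + sigma * eps (assumed linear mu, sigma)."""
    latent = []
    for row_mu, row_sigma in zip(w_mu, w_sigma):
        mu = sum(w * p for w, p in zip(row_mu, phi))
        sigma = abs(sum(w * p for w, p in zip(row_sigma, phi)))  # keep sigma non-negative
        latent.append(mu + sigma * rng.gauss(0.0, 1.0))
    return latent


def generator_input(names, table, w_mu, w_sigma, z_dim, rng):
    """Concatenate the compressed ingredient vector c' with the random number vector z."""
    c = sample_latent(embed(names, table), w_mu, w_sigma, rng)
    z = [rng.gauss(0.0, 1.0) for _ in range(z_dim)]
    return c + z


# Toy, randomly initialised parameters; in a real system these would be learned.
EMBED_DIM, C_DIM, Z_DIM = 8, 3, 5
rng = random.Random(0)
names = ["egg", "chicken", "onion"]
table = {n: [rng.uniform(-1, 1) for _ in range(EMBED_DIM)] for n in names}
w_mu = [[rng.uniform(-1, 1) for _ in range(EMBED_DIM)] for _ in range(C_DIM)]
w_sigma = [[rng.uniform(-1, 1) for _ in range(EMBED_DIM)] for _ in range(C_DIM)]
x = generator_input(names, table, w_mu, w_sigma, Z_DIM, rng)
```

Here `C_DIM` is smaller than `EMBED_DIM`, which corresponds to the dimensionality reduction mentioned in the text, and `x` has `C_DIM + Z_DIM` elements.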

The identification unit 12b identifies how realistic a dish image generated by the generation unit 12a is, using a second model (hereinafter, the “discriminator”) that identifies the realism of an input image, with the dish images and ingredient information stored in the dish data storage unit 13a as training data.

Specifically, the identification unit 12b inputs to the discriminator the dish image generated by the generator, together with the real dish image and the ingredient vector c′ corresponding to the ingredient names that the generation unit 12a obtained by random sampling from the dish data storage unit 13a, and uses a convolutional neural network to identify whether the generated dish image looks real. Any existing method may be used to identify how realistic an image is. For example, the distance between the probability distributions of the generated dish images and the real dish images may be calculated, and realism may be defined according to that distance.

The learning unit 12c optimizes the generator so that it can generate realistic dish images, and optimizes the discriminator so that its accuracy in identifying dish images improves. For example, as described above, each time the generator generates a dish image and the discriminator identifies its realism, the learning unit 12c optimizes the parameters of the generator so that it can generate realistic dish images and optimizes the parameters of the discriminator so that its identification accuracy improves. The learning unit 12c may use any method to optimize the parameters, including any existing optimization method from machine learning.
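
The two opposing objectives can be made concrete with the standard GAN losses. The publication leaves the optimization method open, so the binary cross-entropy form below is an assumption, not the method of the publication. `d_real` and `d_fake` are hypothetical discriminator outputs in (0, 1), read as the estimated probability that an (image, c′) pair is real.

```python
import math


def discriminator_loss(d_real, d_fake, eps=1e-12):
    """Binary cross-entropy: low when D(real) is near 1 and D(fake) is near 0."""
    return -(math.log(d_real + eps) + math.log(1.0 - d_fake + eps))


def generator_loss(d_fake, eps=1e-12):
    """Non-saturating generator loss: low when the discriminator rates fakes as real."""
    return -math.log(d_fake + eps)
```

A learning loop would alternate between a step that lowers `discriminator_loss` with respect to the discriminator parameters and a step that lowers `generator_loss` with respect to the generator parameters.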

The overall flow of the learning process in the image generation device 10 according to the first embodiment is now described with reference to FIG. 3. FIG. 3 is a diagram explaining the learning process in the image generation device according to the first embodiment. As shown in FIG. 3, the image generation device 10 combines two neural networks, a generator and a discriminator. The generator is built to generate random dish images, and the discriminator is built to identify whether an input dish image is a real dish image or one produced by the generator.

As shown in FIG. 3, the image generation device 10 converts the “ingredient names” obtained from the dish data storage unit 13a into a vector φ by word embedding. The image generation device 10 then reduces the dimensionality by representing φ as a latent variable c′ based on the normal distribution N(μ(φ), σ(φ)), and concatenates the vector c′ with a random number vector z (see (A) in FIG. 3).

The image generation device 10 then inputs the concatenated vector to the generator, which upsamples it with a convolutional neural network to generate a dish image (see (B) in FIG. 3).
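
The publication does not specify the upsampling operator or the image resolution. As a simplified stand-in for learned upsampling layers, the sketch below uses plain nearest-neighbour repetition (no convolution weights) to show how repeated 2x upsampling grows a small grid into a larger image. The function names and the 4x4 to 64x64 sizes are assumptions.

```python
def upsample_nearest(grid, factor=2):
    """Repeat every pixel `factor` times along both axes."""
    out = []
    for row in grid:
        wide = [value for value in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out


def grow(grid, steps):
    """Apply `steps` rounds of 2x upsampling, as successive generator layers would."""
    for _ in range(steps):
        grid = upsample_nearest(grid)
    return grid
```

For example, `grow` applied four times to a 4x4 grid yields a 64x64 grid. In an actual generator each upsampling step would be followed or replaced by learned convolutional filters.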

Next, the image generation device 10 inputs to the discriminator the dish image generated by the generator, together with the real dish image and the ingredient vector c′ corresponding to the ingredient names that the generation unit 12a obtained by random sampling from the dish data storage unit 13a, and uses a convolutional neural network to identify whether the generated dish image looks real (see (C) in FIG. 3).

The image generation device 10 repeats the above processing ((A) to (C) in FIG. 3) and optimizes the parameters of each neural network so that the generator produces more realistic dish images and the discriminator identifies dish images more accurately. By training the two neural networks in parallel in this way, if learning succeeds, the generator of the trained model randomly generates and outputs dish images that are difficult to distinguish from real dish images.

In other words, the image generation device 10 uses, as the data set for learning, pairs of an actual dish image and the names of the ingredients used in that dish. The structure of the GAN is modified so that the names of the ingredients used in each dish are used as additional information both when the generator randomly generates an image and when the discriminator identifies an input image. As a result, given the names of ingredients, the generator of the trained model randomly generates and outputs plausible dish images corresponding to them.

The image generation device 10 converts the “ingredient names” into a vector φ by word embedding and then represents φ as a latent variable c′ based on the normal distribution N(μ(φ), σ(φ)). Ingredients usually come in a very large number of types (hundreds of dimensions), and converting the input ingredient names into an ordinary one-hot vector or the like yields a representation with low expressive power. If such a one-hot vector is fed to the generator or the discriminator as is, most of its elements collapse to 0 and learning may not proceed well.

In contrast, by converting the “ingredient names” into a vector φ by word embedding and then representing φ as a latent variable c′ based on the normal distribution N(μ(φ), σ(φ)), the image generation device 10 obtains a lower-dimensional vector with higher expressive power. This prevents most of the elements from collapsing to 0 and allows learning to proceed well. In addition, because the latent variable c′ is a representation based on a probability distribution, it does not reduce the diversity of the images the generator produces.
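
The sparsity problem can be seen with a small calculation. The sketch below encodes a few ingredients over a hypothetical 500-item vocabulary (the publication says only “hundreds of dimensions”) and measures how many elements are zero. With several ingredients the encoding is strictly a multi-hot vector. The vocabulary and function names are assumptions.

```python
def multi_hot(names, vocabulary):
    """Encode a set of ingredient names as a 0/1 vector over the whole vocabulary."""
    chosen = set(names)
    return [1 if item in chosen else 0 for item in vocabulary]


def zero_fraction(vector):
    """Fraction of elements that are exactly zero."""
    return sum(1 for value in vector if value == 0) / len(vector)


vocabulary = ["ingredient_%d" % i for i in range(500)]  # hypothetical vocabulary
encoded = multi_hot(["ingredient_3", "ingredient_42", "ingredient_77"], vocabulary)
```

With three ingredients out of 500, 497 of the 500 elements of `encoded` are zero, whereas a dense embedded vector of a few dozen dimensions would carry a value in every element.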

Returning to FIG. 1, the acquisition unit 12d acquires information on the ingredients entered by the user and randomly generated random number data. For example, the acquisition unit 12d acquires the ingredient names entered on the input screen displayed on the user terminal 20. An example of the ingredient input screen displayed on the user terminal 20 is described here with reference to FIG. 4. FIG. 4 is a diagram illustrating an example of the ingredient input screen displayed on the user terminal.

As illustrated in FIG. 4, the user terminal 20 displays a text box for entering ingredient names and a button for requesting generation of a dish image. For example, when the user enters “egg, chicken, onion, ...” in the text box on the input screen displayed on the user terminal 20 and presses the button labeled “Generate image”, the ingredient names “egg, chicken, onion, ...” and a dish image generation request are sent from the user terminal 20 to the image generation device 10. As an example of a situation in which input occurs, a user preparing dinner enters the names of the ingredients at hand in the text box.

The input screen illustrated in FIG. 4 is merely an example and is not limiting. For example, FIG. 4 shows a single text box into which several ingredient names are entered, but several text boxes may be displayed instead, with one ingredient name entered in each. A pull-down list for selecting ingredient names may also be displayed in place of the text box so that ingredients can be selected from the list.
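
When a single text box holds several ingredient names, the received string has to be split into individual names. The publication does not specify a delimiter. The sketch below assumes the names are separated by the Japanese comma “、” used in the example, or by an ASCII or full-width comma, and the function name is hypothetical.

```python
import re


def parse_ingredients(text):
    """Split text-box input on Japanese, full-width, or ASCII commas and drop empty entries."""
    parts = re.split(r"[、,，]", text)
    return [part.strip() for part in parts if part.strip()]
```

For instance, `parse_ingredients("卵、鶏肉、玉ねぎ")` gives a list of three names, and stray spaces or doubled commas are ignored.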

On receiving the ingredient names from the user terminal 20, the acquisition unit 12d converts them into a vector. For example, the acquisition unit 12d converts the acquired ingredient names into a vector φ by word embedding, and then reduces the dimensionality by representing φ as a latent variable c′ based on the normal distribution N(μ(φ), σ(φ)). The acquisition unit 12d then concatenates the ingredient vector c′ with a random number vector z.

The trained generation unit 12e generates a dish image using a trained model for generating dish images, with the ingredient information and the random number data acquired by the acquisition unit 12d as input. Specifically, the trained generation unit 12e feeds the vector in which the acquisition unit 12d has concatenated the ingredient vector c′ and the random number vector z to the trained model (generator) to generate a dish image. That is, the trained generation unit 12e uses the generator optimized by the learning unit 12c as the trained model. The trained generation unit 12e then outputs the generated dish image to the user terminal 20.

The overall flow of the image generation process in the image generation device 10 according to the first embodiment is now described with reference to FIG. 5. FIG. 5 is a diagram explaining the image generation process in the image generation device according to the first embodiment. As shown in FIG. 5, the image generation device 10 combines two neural networks, a generator and a discriminator. The generator is built to generate random dish images, and the discriminator is built to identify whether an input dish image is a real dish image or one produced by the generator.

As shown in FIG. 5, the image generation device 10 converts the “ingredient names” obtained from the dish data storage unit 13a into a vector φ by word embedding. The image generation device 10 then reduces the dimensionality by representing φ as a latent variable c′ based on the normal distribution N(μ(φ), σ(φ)), and concatenates the vector c′ with a random number vector z.

The image generation device 10 then inputs the concatenated vector to the generator, which is the trained model, and the generator upsamples it with a convolutional neural network to generate a dish image. The image generation device 10 then transmits the generated dish image to the user terminal 20.
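
The image generation flow (acquire the ingredients and random numbers, build the input vector, run the generator) can be sketched end to end with a toy stand-in for the trained model. Every function here is hypothetical: `ingredient_vector` replaces the embedding and compression step with a deterministic hash-like mapping, and `stub_generator` is not a neural network. The sketch only shows that the same ingredients with different random vectors z give different outputs.

```python
import random


def ingredient_vector(names):
    """Hypothetical stand-in for the embedding and compression step that yields c'."""
    return [sum(ord(ch) for ch in name) % 97 / 97.0 for name in sorted(names)]


def stub_generator(vector, size=4):
    """Toy stand-in for the trained generator: maps a vector to a size x size grid in [0, 1)."""
    total = sum((i + 1) * v for i, v in enumerate(vector))
    return [[(total * (r * size + c + 1)) % 1.0 for c in range(size)]
            for r in range(size)]


def generate_dish_image(names, seed=None, z_dim=8):
    """Acquire ingredient names and random numbers, then run the generator."""
    rng = random.Random(seed)
    z = [rng.gauss(0.0, 1.0) for _ in range(z_dim)]  # random number vector z
    return stub_generator(ingredient_vector(names) + z)
```

Calling `generate_dish_image(["egg", "chicken", "onion"], seed=1)` and then again with `seed=2` produces two different grids for the same ingredients, while repeating a seed reproduces the same grid.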

The dish image output screen displayed on the user terminal 20 is described here with reference to FIG. 6. FIG. 6 is a diagram illustrating an example of the dish image output screen displayed on the user terminal. As illustrated in FIG. 6, the user terminal 20 displays, as the dish image output screen, a dish image below the text box in which the ingredient names were entered. The dish image displayed here was randomly generated by the generator of the trained model. For example, the user terminal 20 displays a dish image that differs from existing dish images and was randomly generated given the names of the ingredients the user has at hand.

In this way, the image generation apparatus 10 generates dish images randomly with the generator of the learned model. Therefore, even when the ingredients the user has at hand are the same as before, the output varies appropriately, and dish images that differ from existing dish images can be provided to the user. The user can thus easily obtain diverse and novel suggestions about dishes that can be made from the ingredients at hand in home cooking, which can lead to an improved quality of life through the user's own ingenuity.

[Processing procedure of the image generation apparatus]
Next, an example of the processing procedure performed by the image generation apparatus 10 according to the first embodiment is described with reference to FIGS. 7 and 8. FIG. 7 is a flowchart illustrating an example of the flow of learning processing in the image generation apparatus according to the first embodiment. FIG. 8 is a flowchart illustrating an example of the flow of image generation processing in the image generation apparatus according to the first embodiment.

First, an example of the flow of learning processing in the image generation apparatus 10 is described with reference to FIG. 7. As illustrated in FIG. 7, the generation unit 12a of the image generation apparatus 10 acquires ingredient names by random sampling from the dish data storage unit 13a (step S101). The generation unit 12a then applies word embedding to the acquired ingredient names and converts them into a vector φ (step S102).

Subsequently, the generation unit 12a performs dimensional compression by expressing the converted vector φ as a latent variable c′ based on the normal distribution N(μ(φ), σ(φ)) (step S103). The generation unit 12a then concatenates the vector c′ with the random number vector z (step S104) and generates a dish image from the concatenated vector using the convolutional neural network of the generator (step S105).

The identification unit 12b then inputs to the discriminator the dish image generated by the generator, together with the ingredient vector c′ and the real dish image corresponding to the ingredient names that the generation unit 12a acquired by random sampling from the dish data storage unit 13a. Using the convolutional neural network of the discriminator, the identification unit 12b identifies whether the dish image generated by the generator looks real (step S106).

After that, the learning unit 12c uses a predetermined method to optimize the parameters of the generator so that it can generate realistic dish images, and to optimize the parameters of the discriminator so that its accuracy in identifying dish images improves (step S107). The image generation apparatus 10 repeats the series of steps S101 to S107 until a predetermined condition is satisfied. For example, the image generation apparatus 10 may set the number of repetitions in advance, or may repeat the steps until the accuracy of the generator and the discriminator satisfies a predetermined threshold.
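
The text leaves the optimization method open ("a predetermined method"). As one common possibility, and purely as an assumption for illustration, the sketch below uses the standard GAN objectives: the discriminator minimizes −[log D(real) + log(1 − D(fake))] and the generator minimizes −log D(fake). The loop shows the two stopping conditions mentioned above, a fixed number of repetitions or a threshold. The names `discriminator_loss`, `generator_loss`, `train`, and `step_fn`, and the use of the generator loss as the threshold metric, are hypothetical.

```python
import math


def discriminator_loss(d_real, d_fake, eps=1e-12):
    """Loss for the discriminator, given its outputs in (0, 1) on a real and a generated image."""
    return -(math.log(d_real + eps) + math.log(1.0 - d_fake + eps))


def generator_loss(d_fake, eps=1e-12):
    """Non-saturating generator loss: small when the discriminator rates the fake as real."""
    return -math.log(d_fake + eps)


def train(step_fn, max_iters=1000, target_g_loss=None):
    """Repeat one training step (S101 to S107) until a preset count or a threshold is reached.

    step_fn(i) is a hypothetical callable that runs one step and returns (g_loss, d_loss).
    Returns the number of iterations actually performed.
    """
    for i in range(1, max_iters + 1):
        g_loss, _d_loss = step_fn(i)
        if target_g_loss is not None and g_loss <= target_g_loss:
            return i  # threshold condition satisfied
    return max_iters  # preset repetition count reached


if __name__ == "__main__":
    # Dummy step whose generator loss shrinks over time, only to exercise the loop.
    print(train(lambda i: (1.0 / i, 0.0), max_iters=100, target_g_loss=0.25))
```

In a real training run, `step_fn` would sample a batch, run the generator and the discriminator, and apply a gradient update to each network from its own loss.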

Next, an example of the flow of image generation processing in the image generation apparatus 10 is described with reference to FIG. 8. As illustrated in FIG. 8, when the acquisition unit 12d receives ingredient names entered from the user terminal 20 (Yes in step S201), it converts the ingredient names into a vector φ (step S202). Subsequently, the acquisition unit 12d performs dimensional compression by expressing the converted vector φ as a latent variable c′ based on the normal distribution N(μ(φ), σ(φ)) (step S203). The acquisition unit 12d then concatenates the ingredient vector c′ with the random number vector z (step S204).

The learned generation unit 12e then generates a dish image using the learned model (generator) with the concatenated vector as input (step S205). After that, the learned generation unit 12e outputs the generated dish image to the user terminal 20 (step S206).
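
Steps S201 to S206 can be summarized in a short sketch. The encoder and generator below are hypothetical placeholders passed in as functions (a real system would use the trained embedding, compression, and generator networks), and the "image" is just a short list of numbers. The point of the example is the control flow and the fact that a fresh random vector z is drawn on every request.

```python
import math
import random


def generate_dish_image(ingredients, encode, generator, z_dim=4, rng=None):
    """Run S201 to S205 once and return the generated image, or None if nothing was entered."""
    rng = rng or random.Random()
    if not ingredients:                              # S201: no input received yet
        return None
    c_prime = encode(ingredients)                    # S202 to S203: names -> phi -> c'
    z = [rng.gauss(0.0, 1.0) for _ in range(z_dim)]  # fresh random number vector z
    combined = c_prime + z                           # S204: concatenate c' and z
    return generator(combined)                       # S205: the caller sends the result (S206)


def toy_encode(ingredients):
    """Hypothetical encoder: two deterministic features instead of a learned c'."""
    return [float(len(ingredients)), float(sum(len(name) for name in ingredients))]


def toy_generator(vector):
    """Hypothetical generator: squashes each input value into (-1, 1)."""
    return [math.tanh(value) for value in vector]


if __name__ == "__main__":
    rng = random.Random(0)
    first = generate_dish_image(["tomato", "egg"], toy_encode, toy_generator, rng=rng)
    second = generate_dish_image(["tomato", "egg"], toy_encode, toy_generator, rng=rng)
    print(first != second)  # same ingredients, different random vector z
```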

(Effects of the first embodiment)
The image generation apparatus 10 according to the first embodiment acquires information on ingredients input by a user and randomly generated random number data, and generates a dish image using a learned model for generating dish images, with the acquired information on the ingredients and the random number data as input. It can therefore easily provide dish images that leave room for ingenuity and help the user create a rich variety of dishes.

In other words, the image generation apparatus 10 can use the generator of the learned model to randomly generate dish images that are previously unseen yet look real. Even when the ingredients the user has at hand are the same as or similar to before, the output varies appropriately, and the apparatus can provide dish images that let the user imagine the appearance of the dish to some extent.

In addition, the image generation apparatus 10 does not present a recipe but provides the user with dish images that differ from existing dish images, which leaves room for a certain amount of ingenuity in cooking. The image generation apparatus 10 also requires only the names of the ingredients at hand as input, so a dish image can be obtained easily. In this way, the image generation apparatus 10 allows the user to easily obtain diverse and novel suggestions about dishes that can be made from the ingredients at hand in home cooking, which can lead to an improved quality of life through the user's own ingenuity.

(System configuration, etc.)
Each component of each illustrated apparatus is functionally conceptual and does not necessarily need to be physically configured as illustrated. That is, the specific form of distribution and integration of each apparatus is not limited to the illustrated one, and all or part of each apparatus can be functionally or physically distributed or integrated in arbitrary units according to various loads, usage conditions, and the like. Furthermore, all or any part of each processing function performed in each apparatus can be realized by a CPU and a program analyzed and executed by the CPU, or can be realized as hardware using wired logic.

Of the processes described in this embodiment, all or part of the processes described as being performed automatically can also be performed manually, and all or part of the processes described as being performed manually can also be performed automatically by a known method. In addition, the processing procedures, control procedures, specific names, and information including various data and parameters shown in the above document and drawings can be changed arbitrarily unless otherwise specified.

(Program)
A program can also be created in which the processing executed by the image generation apparatus described in the above embodiment is written in a language that a computer can execute. For example, an image generation program can be created in which the processing executed by the image generation apparatus 10 according to the embodiment is written in a computer-executable language. In this case, the same effects as in the above embodiment can be obtained when a computer executes the image generation program. Furthermore, the same processing as in the above embodiment may be realized by recording the image generation program on a computer-readable recording medium and having a computer read and execute the image generation program recorded on that medium.

FIG. 9 is a diagram illustrating a computer that executes the image generation program. As illustrated in FIG. 9, the computer 1000 includes, for example, a memory 1010, a CPU 1020, a hard disk drive interface 1030, a disk drive interface 1040, a serial port interface 1050, a video adapter 1060, and a network interface 1070, and these units are connected by a bus 1080.

As illustrated in FIG. 9, the memory 1010 includes a ROM (Read Only Memory) 1011 and a RAM 1012. The ROM 1011 stores, for example, a boot program such as a BIOS (Basic Input Output System). The hard disk drive interface 1030 is connected to a hard disk drive 1090. The disk drive interface 1040 is connected to a disk drive 1100. A removable storage medium such as a magnetic disk or an optical disk is inserted into the disk drive 1100. The serial port interface 1050 is connected to, for example, a mouse 1110 and a keyboard 1120. The video adapter 1060 is connected to, for example, a display 1130.

As illustrated in FIG. 9, the hard disk drive 1090 stores, for example, an OS 1091, an application program 1092, a program module 1093, and program data 1094. That is, the image generation program described above is stored, for example, in the hard disk drive 1090 as a program module in which the instructions to be executed by the computer 1000 are written.

The various data described in the above embodiment are stored as program data in, for example, the memory 1010 or the hard disk drive 1090. The CPU 1020 reads the program module 1093 and the program data 1094 stored in the memory 1010 or the hard disk drive 1090 into the RAM 1012 as necessary and executes the various processing procedures.

The program module 1093 and the program data 1094 of the image generation program are not limited to being stored in the hard disk drive 1090. For example, they may be stored in a removable storage medium and read by the CPU 1020 via a disk drive or the like. Alternatively, they may be stored in another computer connected via a network (LAN (Local Area Network), WAN (Wide Area Network), etc.) and read by the CPU 1020 via the network interface 1070.

The above embodiments and their modifications are included in the technology disclosed in the present application and, likewise, in the invention described in the claims and its equivalents.

Description of symbols
10 Image generation apparatus
11 Communication processing unit
12 Control unit
12a Generation unit
12b Identification unit
12c Learning unit
12d Acquisition unit
12e Learned generation unit
13 Storage unit
13a Dish data storage unit
20 User terminal
30 Network

Claims

An image generation apparatus comprising:
an acquisition unit that acquires information on ingredients input by a user and randomly generated random number data; and
a learned generation unit that generates a dish image, using a learned model for generating dish images, with the information on the ingredients and the random number data acquired by the acquisition unit as input.

The image generation apparatus according to claim 1, further comprising:
a storage unit that stores a dish image, which is an image of a real dish, in association with information on the ingredients used in the real dish;
a generation unit that generates a dish image using a first model that takes as input the information on the ingredients stored in the storage unit and random number data and generates a dish image;
an identification unit that identifies the authenticity of the dish image generated by the generation unit, using a second model that identifies the authenticity of an input image, with the dish image and the information on the ingredients stored in the storage unit as training data; and
a learning unit that optimizes the first model so that it can generate realistic dish images and optimizes the second model so that its accuracy in identifying dish images improves,
wherein the learned generation unit generates the dish image using, as the learned model, the first model optimized by the learning unit.

The image generation apparatus according to claim 2, wherein the generation unit applies word embedding to ingredient names, which are the information on the ingredients stored in the storage unit, to convert them into a vector, performs dimensional compression by expressing the vector as a latent variable based on a normal distribution, concatenates the dimensionally compressed vector with a random number vector, and generates the dish image using the first model with the concatenated vector as input.

An image generation method executed by an image generation apparatus, the method comprising:
an acquisition step of acquiring information on ingredients input by a user and randomly generated random number data; and
a learned generation step of generating a dish image, using a learned model for generating dish images, with the information on the ingredients and the random number data acquired in the acquisition step as input.

An image generation program that causes a computer to execute:
an acquisition step of acquiring information on ingredients input by a user and randomly generated random number data; and
a learned generation step of generating a dish image, using a learned model for generating dish images, with the information on the ingredients and the random number data acquired in the acquisition step as input.