JP7398723B1

JP7398723B1 - Image generation device, prompt creation support device, program and application program

Info

Publication number: JP7398723B1
Application number: JP2023026784A
Authority: JP
Inventors: 優愛上村; 健夫鈴木; 弥憲舘; 凌佐藤; 真矢小路
Original assignee: Individual
Current assignee: Individual
Priority date: 2023-02-23
Filing date: 2023-02-23
Publication date: 2023-12-15
Anticipated expiration: 2043-02-23
Also published as: JP2024120131A

Abstract

[Problem] To provide an image generation device, a prompt creation support device, a program, and an application program that support image creation by a user.
[Solution] In an illustration background image generation system in which an illustration background image generation server and a plurality of user terminals are connected via a network, the illustration background image generation server has: a UI data transmission unit that transmits, to a user terminal, UI data for generating a UI including a plurality of tags; a tag receiving unit that receives, from the user terminal, one or more tags selected by the user; a prompt creation unit that creates a prompt including elements corresponding to the tags; an AI image generation unit that generates an image based on the prompt; and an image output unit that outputs the image. By selecting situation tags for a background image that matches the overall composition with the user's character, the perspective, the story, and so on, the user can easily create a prompt text.
[Selected drawing] FIG. 21

Description

Application of Article 30, Paragraph 2 of the Patent Act: published on February 10, 2023 (Reiwa 5) on the applicant's own website "REFER" and on Twitter. https://refer-art.jp https://twitter.com/Refer_official/status/1623985149712175109?s=20

The present invention relates to an image generation device, a prompt creation support device, a program, and an application program.

In recent years, image generation AI systems in which AI automatically generates original images have become known (see, for example, Non-Patent Documents 1 and 2). Images produced by image generation AI are generated by an algorithm called a "diffusion model" built into the system. Because users work with a system in which the diffusion model is installed as a trained model, they can generate a wide variety of images through text input alone, without understanding the algorithm or writing program code.

Non-Patent Document 1: Stable Diffusion Online, Internet <URL: https://stablediffusionweb.com/>
Non-Patent Document 2: NovelAI - The AI Storyteller, Internet <URL: https://novelai.net/>

What these image generation AIs have in common is that a high-resolution graphic image is generated by inputting a text called a prompt. More specifically, the prompt input to an image generation AI is an instruction, made of English words combined with comma separators, that is fed to a "text to image" type AI.

However, with such a "text to image" type AI, the user must craft an appropriate prompt (text) in order to output an image close to the ideal the user has in mind, and this is not necessarily easy. That is, the instructions in a prompt input to an image generation AI must be written entirely in English text, and fine control over the generated image is difficult. In fact, specialists called prompt engineers, who research and write prompts to find out how a prompt should be devised so that an image generation AI produces high-quality images, have begun to appear.

On the other hand, there are illustrators (artists) who create their own original characters without relying on such image generation AI and post their works on illustration posting sites and the like. In character illustration, not only the character itself but also the illustration background placed behind the character is an important element. Drawing a background that better matches the character, however, requires skills different from those needed to draw characters, as well as additional time, for example to grasp the overall composition, the perspective, or the story objectively. For this reason, artists often tend to devote their skills and time to pursuing the originality and quality of their characters, and in practice the illustration background is sometimes left as a plain single color or a similarly simple background.

The present invention has been proposed in view of the above points, and in one aspect aims to support image creation by a user.

To solve the above problem, an image generation device according to the present invention has: UI data transmission means for transmitting, to a user terminal, UI data for generating a UI including a plurality of tags; tag receiving means for receiving, from the user terminal, one or more of the tags selected by the user; prompt creation means for creating a prompt including an element corresponding to the tag; image generation means for generating an image based on the prompt; and image output means for outputting the image. The prompt creation means creates a prompt that includes both a prompt consisting of predetermined first prompt elements defined in advance and a prompt including the element corresponding to the tag, and a negative prompt consisting of predetermined second prompt elements defined in advance.

Further, to solve the above problem, a prompt creation support device according to the present invention has: UI data transmission means for transmitting, to a user terminal, UI data for generating a UI including a plurality of tags; tag receiving means for receiving, from the user terminal, one or more of the tags selected by the user; and prompt creation means for creating a prompt including an element corresponding to the tag. The prompt creation means creates a prompt that includes both a prompt consisting of predetermined first prompt elements defined in advance and a prompt including the element corresponding to the tag, and a negative prompt consisting of predetermined second prompt elements defined in advance.

According to the embodiment of the present invention, image creation by a user can be supported.

FIG. 1 is a diagram showing an example of the network configuration of the illustration background image generation system according to the present embodiment.
FIG. 2 is a diagram showing an example of the hardware configuration of the illustration background image generation server according to the present embodiment.
FIG. 3 is a diagram showing an example of the software configuration of the illustration background image generation server according to the present embodiment.
FIG. 4 is a diagram showing an example of the data structure of the user DB according to the present embodiment.
FIG. 5 is a diagram showing an example of the data structure of the image storage DB according to the present embodiment.
FIG. 6 is a diagram showing an example of the tag table information in the tag DB according to the present embodiment.
FIG. 7 is a diagram showing an example of the option tag table information in the tag DB according to the present embodiment.
FIG. 8 shows an example (part 1) of the web UI screen of the user terminal according to the present embodiment.
FIG. 9 shows an example (part 2) of the web UI screen of the user terminal according to the present embodiment.
FIG. 10 shows an example (part 3) of the web UI screen of the user terminal according to the present embodiment.
FIG. 11 shows an example (part 4) of the web UI screen of the user terminal according to the present embodiment.
FIG. 12 shows an example (part 5) of the web UI screen of the user terminal according to the present embodiment.
FIG. 13 shows an example (part 6) of the web UI screen of the user terminal according to the present embodiment.
FIG. 14 shows an example (part 7) of the web UI screen of the user terminal according to the present embodiment.
FIG. 15 shows an example (part 8) of the web UI screen of the user terminal according to the present embodiment.
FIG. 16 shows an example (part 9) of the web UI screen of the user terminal according to the present embodiment.
FIG. 17 shows an example (part 10) of the web UI screen of the user terminal according to the present embodiment.
FIG. 18 shows an example (part 11) of the web UI screen of the user terminal according to the present embodiment.
FIG. 19 shows an example (part 12) of the web UI screen of the user terminal according to the present embodiment.
FIG. 20 shows an example of an illustration background image after editing according to the present embodiment.
FIG. 21 is a flowchart showing the image generation processing of the illustration background image generation server according to the present embodiment.
FIG. 22 shows examples of illustration background images with and without the "common prompt" and the "negative prompt" according to the present embodiment.

An embodiment of the present invention will be described in detail with reference to the drawings.
<Network configuration>
FIG. 1 is a diagram showing an example of the network configuration of the illustration background image generation system according to the present embodiment. In the illustration background image generation system 100 of FIG. 1, an illustration background image generation server 10 and a user terminal 20 are connected via a network 50.

The illustration background image generation server (hereinafter also simply called the image generation server) 10 is a server device that provides the user with a web UI (User Interface) screen for generating an illustration background image (background content), and that generates the illustration background image by inputting a prompt, created from the user's selection operations on the web UI screen, to an AI image generation unit trained with a Diffusion Model.

The user terminal 20 is an information processing terminal, such as a PC (Personal Computer), a smartphone, or a tablet, owned by a user such as an illustrator. Through the web UI screen of the image generation server 10, the user creates a prompt for outputting an illustration background image close to the user's ideal, while taking into account the overall composition with the user's character, the perspective, the story, and so on. By having the created prompt input to the AI image generation unit, the user terminal 20 can also cause an illustration background image to be generated and output.

The network 50 is a communication network, wired or wireless. The network 50 includes, for example, the Internet, a public line network, and WiFi (registered trademark).

(Hardware configuration)
FIG. 2 is a diagram showing an example of the hardware configuration of the illustration background image generation server according to the present embodiment. The image generation server 10 has a CPU (Central Processing Unit) 11, a ROM (Read Only Memory) 12, a RAM (Random Access Memory) 13, an HDD (Hard Disk Drive) 14, and a communication device 15.

The CPU 11 executes various programs and performs arithmetic processing. The ROM 12 stores programs and the like needed at startup. The RAM 13 is a work area that temporarily holds the processing of the CPU 11 and stores data. The HDD 14 stores various data and programs. The communication device 15 communicates with other devices via the network 50.

(Software configuration)
FIG. 3 is a diagram showing an example of the software configuration of the illustration background image generation server according to the present embodiment. As its main functional units, the image generation server 10 has a web UI data acquisition unit 101, a web UI data transmission unit 102, a tag receiving unit 103, a prompt creation unit 104, an AI image generation unit 105, an image output unit 106, a character image acquisition unit 107, an image editing unit 108, and a storage unit 109.

The web UI data acquisition unit 101 has a function of acquiring web UI data for generating a web UI (for example, a web UI screen) that includes a plurality of situation tags.

The web UI data transmission unit 102 has a function of transmitting the web UI data to the user terminal 20.

The tag receiving unit 103 has a function of receiving, from the user terminal 20, one or more tags selected by the user.

The prompt creation unit 104 has a function of creating a prompt that includes elements corresponding to the tags selected by the user.

The AI image generation unit 105 has a function of generating an image based on the created prompt.

The image output unit 106 has a function of outputting the generated image.

The character image acquisition unit 107 has a function of acquiring a character image from the user terminal 20.

The image editing unit 108 has a function of placing the character image on top of the background image.

The storage unit 109 stores a user DB 109a, an image storage DB 109b, a tag DB 109c, and a learning data DB 109d. The user DB 109a is a DB in which the users' user information is registered. The image storage DB 109b is a DB that holds user character images (character content images), illustration background images (background content images), and composite content images. The tag DB 109c is a DB that holds a tag table and an option tag table. The learning data DB 109d is a very large image dataset learned in advance by, for example, a known Diffusion Model, which the AI image generation unit 105 of the image generation server 10 uses to generate content images of illustration backgrounds.

Each functional unit is realized by a program executed on the hardware resources, such as the CPU, ROM, and RAM, of the computer constituting the image generation server 10. These functional units may also be read as "means", "modules", "units", or "circuits". The functional units need not reside on a single image generation server 10; the functions may be distributed across a plurality of server devices that communicate with one another via the network 50 so as to provide the same functions as a single image generation server 10. Each DB in the storage unit 109 may also be placed on an external storage device on the network 50. The computer program and the application program may be stored in a computer-readable storage medium.
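The statement that the functional units may run on one server or be spread over several can be pictured by treating each unit as an injected callable. The following Python sketch is only an illustration under stated assumptions: the function `handle_generation` and the three `stub_*` stand-ins are hypothetical names that do not appear in the specification, and no real image generation model is called.

```python
from typing import Callable, List, Tuple

PromptPair = Tuple[str, str]  # (prompt, negative prompt)


def handle_generation(
    selected_tags: List[str],
    create_prompt: Callable[[List[str]], PromptPair],
    generate_image: Callable[[str, str], bytes],
    output_image: Callable[[bytes], str],
) -> str:
    """Run prompt creation, image generation and image output in order.

    Each callable may be a local function or a client for a unit hosted
    on another server; the wiring stays the same either way.
    """
    prompt, negative = create_prompt(selected_tags)
    image = generate_image(prompt, negative)
    return output_image(image)


# Hypothetical stand-ins used only to exercise the wiring.
def stub_create_prompt(tags: List[str]) -> PromptPair:
    return ",".join(tags), "low quality"


def stub_generate_image(prompt: str, negative: str) -> bytes:
    return f"{prompt}|{negative}".encode("utf-8")


def stub_output_image(image: bytes) -> str:
    return f"{len(image)} bytes sent"


result = handle_generation(
    ["pool", "morning"], stub_create_prompt, stub_generate_image, stub_output_image
)
```

Passing the units in as parameters is one possible design; it keeps the order of tag reception, prompt creation, generation, and output fixed while leaving open where each unit actually runs.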

(Database)
FIG. 4 is a diagram showing an example of the data structure of the user DB according to the present embodiment. The user DB 109a is a DB in which the users' user information is registered. The user DB 109a according to the present embodiment has data items such as "user ID", "user name", "password", and "email address".

"User ID" is a unique identifier assigned to each user. "User ID" and "password" are the ID and password with which the user accesses and logs in to the web UI screen of the image generation server 10.

FIG. 5 is a diagram showing an example of the data structure of the image storage DB according to the present embodiment. The images stored in the image storage DB 109b are user character images (character content images), illustration background images (background content images), and composite content images. A user character image is a content image of a character (a transparent image) uploaded by the user via the web UI screen. An illustration background image is a content image of an illustration background generated by the image generation unit of the image generation server 10. A composite content image is a content image obtained by superimposing a user character image on an illustration background image.
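Superimposing a transparent character image on a background can be illustrated with ordinary alpha blending. The sketch below is a minimal example, not the method of the specification: it assumes 8-bit RGBA character pixels with straight (non-premultiplied) alpha, an opaque RGB background of the same size, and hypothetical names `blend_pixel` and `composite`.

```python
from typing import List, Tuple

RGBA = Tuple[int, int, int, int]  # 8-bit channels, straight (non-premultiplied) alpha
RGB = Tuple[int, int, int]        # opaque background pixel


def blend_pixel(fg: RGBA, bg: RGB) -> RGB:
    """Blend one character pixel over one opaque background pixel."""
    alpha = fg[3] / 255.0
    red = round(fg[0] * alpha + bg[0] * (1.0 - alpha))
    green = round(fg[1] * alpha + bg[1] * (1.0 - alpha))
    blue = round(fg[2] * alpha + bg[2] * (1.0 - alpha))
    return (red, green, blue)


def composite(character: List[List[RGBA]], background: List[List[RGB]]) -> List[List[RGB]]:
    """Superimpose a character image on a background image of the same size."""
    return [
        [blend_pixel(fg, bg) for fg, bg in zip(fg_row, bg_row)]
        for fg_row, bg_row in zip(character, background)
    ]


character = [[(255, 0, 0, 255), (255, 0, 0, 0)]]  # opaque red, fully transparent
background = [[(0, 0, 255), (0, 0, 255)]]         # blue background
merged = composite(character, background)
```

Where the character pixel is fully transparent the background shows through unchanged, which is why the uploaded character image needs to be a transparent image.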

The image storage DB (user character images) shown in FIG. 5(a) has data items such as "user ID", "upload date and time", and "image data". "User ID" is the user ID corresponding to the user DB 109a. "Upload date and time" is the date and time at which the user uploaded the user character image. "Image data" is the data of the user character image uploaded by the user.

The image storage DB (illustration background images) shown in FIG. 5(b) has data items such as "user ID", "creation date and time", "image data", "tag", and "prompt". "User ID" is the user ID corresponding to the user DB 109a. "Creation date and time" is the date and time at which the illustration background image was generated by the image generation unit of the image generation server 10. "Image data" is the data of the illustration background image generated by the AI image generation unit 105 of the image generation server 10. "Tag" is the tag selected by the user on the web UI. "Prompt" is the prompt text input to the AI image generation unit 105 to generate that illustration background image.

The image storage DB (composite content images) shown in FIG. 5(c) has data items such as "user ID", "creation date and time", and "image data". "User ID" is the user ID corresponding to the user DB 109a. "Creation date and time" is the date and time at which the composite content image was generated in response to a user operation. "Image data" is the data of the composite content image generated in response to a user operation.
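As a concrete picture of the illustration background image records, the data items could be laid out as a relational table. The sketch below uses Python's standard `sqlite3` module with an in-memory database; the table name, column names, storage of tags as a comma-joined string, and the helper functions are all assumptions made for illustration, since the specification only names the data items.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE background_images (
        user_id    TEXT NOT NULL,   -- "user ID", matches the user DB
        created_at TEXT NOT NULL,   -- "creation date and time" (ISO 8601 text)
        image_data BLOB NOT NULL,   -- "image data"
        tags       TEXT NOT NULL,   -- "tag": tags selected on the web UI
        prompt     TEXT NOT NULL    -- "prompt": text given to the generator
    )
    """
)


def save_background(user_id, created_at, image_data, tags, prompt):
    """Store one generated background together with its tags and prompt."""
    conn.execute(
        "INSERT INTO background_images VALUES (?, ?, ?, ?, ?)",
        (user_id, created_at, image_data, ",".join(tags), prompt),
    )


def list_backgrounds(user_id):
    """Return (created_at, tags, prompt) rows for one user, newest first."""
    rows = conn.execute(
        "SELECT created_at, tags, prompt FROM background_images "
        "WHERE user_id = ? ORDER BY created_at DESC",
        (user_id,),
    )
    return rows.fetchall()


# Hypothetical sample rows.
save_background("u001", "2023-02-10T09:00:00", b"img1", ["pool", "morning"], "pool,sunlight")
save_background("u001", "2023-02-11T09:00:00", b"img2", ["corridor"], "corridor")
```

Keeping the tags and the prompt next to the image data makes it possible to show later which selection produced a given background.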

FIG. 6 is a diagram showing an example of the tag table information in the tag DB according to the present embodiment. The tag table in the tag DB 109c associates tags that denote situations of the illustration background (also called situation tags) with the prompts that express those tags. The tags are displayed on the web UI screen so that the user can select them. The tag table according to the present embodiment has data items such as "tag category", "tag", "prompt element", "option tag category", "default prompt element", and "custom base prompt".

"Tag category" is a first-level (top-level) semantic-concept tag displayed to the user on the web UI screen. "Tag" is a second-level tag that belongs under one of the tag categories and is displayed to the user on the web UI screen. For example, the first-level tag categories include "school", "building", "ruins", "season", "outdoors", and "nature", and the first-level "school" tag category contains second-level tags such as "pool", "schoolyard", "corridor", "library", and "classroom".

"Prompt element" is a constituent of the prompt to be created: English directive words used to convert a tag selected (specified) by a user operation on the web UI screen into the form of expression used in prompts. To generate an illustration background image, what is input to the AI image generation unit 105 of the image generation server 10 is not the tag shown on the web UI screen itself but a prompt (an instruction made of comma-separated English words) containing the prompt elements obtained through the tag table.
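The two-level tag hierarchy and the tag-to-prompt-element conversion can be sketched as a small lookup table. In the Python example below, the category and tag names follow the text, while every prompt element string and the names `TagRow`, `TAG_TABLE`, `tags_in_category`, and `to_prompt_elements` are hypothetical and chosen only for illustration.

```python
from typing import List, NamedTuple


class TagRow(NamedTuple):
    category: str        # first-level "tag category"
    tag: str             # second-level "tag" shown on the web UI
    prompt_element: str  # English directive words for the prompt


# Categories and tags follow the text; the prompt elements are illustrative guesses.
TAG_TABLE: List[TagRow] = [
    TagRow("school", "pool", "pool,school,outdoors"),
    TagRow("school", "schoolyard", "schoolyard,school,outdoors"),
    TagRow("school", "corridor", "corridor,school,indoors"),
    TagRow("school", "library", "library,school,indoors"),
    TagRow("school", "classroom", "classroom,school,indoors"),
]


def tags_in_category(category: str) -> List[str]:
    """List the second-level tags displayed under one tag category."""
    return [row.tag for row in TAG_TABLE if row.category == category]


def to_prompt_elements(selected: List[str]) -> str:
    """Convert selected tags into a comma-separated string of prompt elements."""
    by_tag = {row.tag: row.prompt_element for row in TAG_TABLE}
    unknown = [tag for tag in selected if tag not in by_tag]
    if unknown:
        raise KeyError(f"tags not in tag table: {unknown}")
    return ",".join(by_tag[tag] for tag in selected)
```

Rejecting unknown tags, rather than passing them through, reflects the point that the tag text itself is never sent to the generator; only table-derived prompt elements are.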

"Option tag category" is an optional tag category displayed to the user as an option on the web UI screen. Because it is optional, whether an option tag category exists, and what it contains, differs from tag to tag. For example, tags denoting outdoor situations, such as "pool" and "schoolyard", have an option tag category for the concept "time of day", whereas tags denoting indoor situations, such as "corridor", "library", and "classroom", have no "time of day" option tag category.

"Default prompt element" is the prompt element of the option tag defined to serve as the initial value when a tag has a plurality of option tag categories and the user does not select (specify) any of them.

"Custom base prompt" is a prompt that, at prompt creation time, is specified in place of the common prompt. For a tag that defines a custom base prompt, for example the "corridor" tag, the custom base prompt "(((masterpiece))),(((best quality))), (((high quality))), (((scenery))), (((no humans))), (((absurdres)))" is input instead of applying the "common prompt". According to the applicant's research, for tags whose custom base prompt is defined in the tag table, applying the custom base prompt rather than the common prompt can be expected to improve the quality of the generated image.
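The replacement rule for the custom base prompt amounts to a lookup with a fallback. The Python sketch below uses the "corridor" custom base prompt quoted above and the common prompt example given later in this description; the dictionary layout and the function name `base_prompt_for` are assumptions made for illustration.

```python
# Common prompt as quoted in the example given in this description.
COMMON_PROMPT = (
    "(((masterpiece))),(((best quality))), (((high quality))), (((scenery))), "
    "(((no humans))), (((absurdres))),light_particles,finely detail"
)

# Custom base prompt quoted in the text for the "corridor" tag.
# Storing it in a dict keyed by tag is an assumption of this sketch.
CUSTOM_BASE_PROMPTS = {
    "corridor": (
        "(((masterpiece))),(((best quality))), (((high quality))), "
        "(((scenery))), (((no humans))), (((absurdres)))"
    ),
}


def base_prompt_for(tag: str) -> str:
    """Return the tag's custom base prompt if one is defined, else the common prompt."""
    return CUSTOM_BASE_PROMPTS.get(tag, COMMON_PROMPT)
```

With these strings, the only visible difference for "corridor" is that `light_particles` and `finely detail` are left out of the base prompt.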

FIG. 7 is a diagram showing an example of the option tag table information in the tag DB according to the present embodiment. The option tag table in the tag DB 109c associates option tags that denote situations of the illustration background (also called situation option tags) with the prompts that express those option tags.

The option tag table according to the present embodiment has data items such as "option tag category", "option tag", and "prompt element". "Option tag category" is the optional tag category corresponding to the "option tag category" in the tag table information shown in FIG. 6. "Option tag" is an option tag that belongs under one of the option tag categories. "Prompt element" is the English directive words used, when an option tag is selected (specified) by a user operation on the web UI screen, to convert that option tag into the form of expression used in prompts.
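How an option tag, or the default prompt element in its absence, contributes to the prompt can be sketched as follows. This Python example simplifies to at most one option tag category per tag. Only the "time of day" category and the "morning", "pool", and "corridor" tags come from the text; the other option tags, every prompt element string (including the guess that "sunlight,blue_sky" belongs to "morning"), and all identifiers are hypothetical.

```python
from typing import Dict, Optional

# Option tag table: option tag category -> {option tag -> prompt element}.
# All prompt elements here are illustrative assumptions.
OPTION_TABLE: Dict[str, Dict[str, str]] = {
    "time of day": {
        "morning": "sunlight,blue_sky",
        "evening": "sunset,orange_sky",
        "night": "night,starry_sky",
    },
}

# Per-tag columns of the tag table: the option tag category that applies
# (None for tags without one) and the default prompt element.
TAG_OPTIONS: Dict[str, Dict[str, Optional[str]]] = {
    "pool": {"option_category": "time of day", "default_element": "sunlight,blue_sky"},
    "corridor": {"option_category": None, "default_element": None},
}


def option_element(tag: str, selected_option: Optional[str]) -> Optional[str]:
    """Prompt element contributed by the option tag, or None if the tag has no options."""
    row = TAG_OPTIONS[tag]
    category = row["option_category"]
    if category is None:
        return None                    # e.g. indoor tags without "time of day"
    if selected_option is None:
        return row["default_element"]  # nothing chosen: fall back to the default
    return OPTION_TABLE[category][selected_option]
```

Under these assumptions, a "pool" request with no option selected yields the same element as choosing "morning", while "corridor" never contributes an option element.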

<Prompt>
A prompt is an instruction made of comma-separated English words that is input to a text-to-image AI. In the present embodiment, the prompt to be input to the AI image generation unit 105 is created by converting the tags selected (specified) on the web UI screen according to the situation of the background image the user wants into the form of expression used in prompts, based on the tag table and the option tag table.

In general, a "text to image" type AI takes two inputs, a "prompt" and a "negative prompt", so the input to the AI image generation unit 105 likewise consists of a "prompt" and a "negative prompt". In the present embodiment, the former "prompt" is composed of a "common prompt" and a "user prompt". The common prompt is a fixed prompt that is always included in every "prompt" and consists of predetermined prompt elements defined in advance. The user prompt consists of the prompt elements corresponding to the tags selected by the user. The latter "negative prompt" is a prompt for indicating what should not appear in the image, and consists of predetermined prompt elements defined in advance.
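The composition of the two inputs can be sketched in a few lines of Python. This is a minimal illustration, not the implementation of the specification: the names `PromptPair` and `build_prompt_pair` are hypothetical, the sample strings are abbreviated from the examples in this description, and placing the common prompt before the user prompt with a comma between them is an assumption, since the text does not state the order.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class PromptPair:
    prompt: str           # common prompt followed by the user prompt
    negative_prompt: str  # fixed list of things to keep out of the image


def build_prompt_pair(common: str, user_elements: Sequence[str], negative: str) -> PromptPair:
    """Join the common prompt and the user's prompt elements; pass the negative prompt through."""
    # Assumption: the common prompt comes first and parts are joined with commas.
    parts = [common] + [element for element in user_elements if element]
    return PromptPair(prompt=",".join(parts), negative_prompt=negative)


pair = build_prompt_pair(
    common="(((masterpiece))),(((best quality)))",  # abbreviated common prompt
    user_elements=["pool,school,outdoors,((poolside))", "sunlight,blue_sky"],
    negative="worst quality, low quality",          # abbreviated negative prompt
)
```

Only `user_elements` varies with the user's tag selection; the other two arguments are fixed in advance, which is what lets the user avoid writing any prompt text.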

Examples of the "common prompt," the "user prompt," and the "negative prompt" are given below. Parentheses () instruct the AI to emphasize the prompt element they enclose.
・Example of common prompt
"(((masterpiece))),(((best quality))), (((high quality))), (((scenery))), (((no humans))), (((absurdres))),light_particles,finely detail"
・Example of user prompt (for the "pool" and "morning" tags)
"pool,school,outdoors,((poolside)),sunlight,blue_sky"
・Example of negative prompt
"worst quality, low quality, medium quality, deleted, lowres, comic, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, jpeg artifacts, signature, watermark, username, blurry,cropped,ugly,outside_border"
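The structure of these prompt strings (comma-separated elements, some wrapped in nested parentheses for emphasis) can be illustrated with a minimal Python sketch. The function names `split_elements` and `emphasis_depth` are hypothetical, and the sketch only counts how many parenthesis pairs enclose an element; how a particular text-to-image AI turns that nesting into an emphasis strength is not stated in this document and is not modeled here.

```python
def split_elements(prompt: str) -> list[str]:
    """Split a comma-separated prompt into trimmed, non-empty elements."""
    return [part.strip() for part in prompt.split(",") if part.strip()]


def emphasis_depth(element: str) -> tuple[str, int]:
    """Return (bare element, number of enclosing parenthesis pairs).

    Simplification: assumes an element is wrapped as a whole, e.g. "((poolside))".
    """
    depth = 0
    while len(element) >= 2 and element.startswith("(") and element.endswith(")"):
        element = element[1:-1].strip()
        depth += 1
    return element, depth


# Common prompt example quoted from the description above.
COMMON_PROMPT = (
    "(((masterpiece))),(((best quality))), (((high quality))), (((scenery))), "
    "(((no humans))), (((absurdres))),light_particles,finely detail"
)

parsed = [emphasis_depth(e) for e in split_elements(COMMON_PROMPT)]
for name, depth in parsed:
    print(f"{name}: emphasis depth {depth}")
```

In this reading, the six quality and scenery elements carry three levels of emphasis, while "light_particles" and "finely detail" carry none.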

<Image generation example>
Next, an example of generating an image through user operations on the web UI screen displayed on the user terminal 20 is described.
(Generation of illustration background image)
FIG. 8 shows an example (part 1) of the web UI screen of the user terminal according to the present embodiment. The user uses a predetermined application program on the user terminal 20 (for example, a general-purpose web browser or a dedicated application program) to access and log in to the illustration background image generation server 10, whereupon the web UI screen is displayed on the screen of the user terminal 20.

In the generation mode 201 for generating images, the web UI screen has a "Gallery" 202 that lists the illustration background images that have been generated and saved, a generated image display section 203 in which a generated image is displayed, and a tag selection section 204 for selecting the situation of the illustration background that the user wants to generate. From the tags in the tag selection section 204, the user selects the tags (keywords) that match the illustration background image to be generated.

FIG. 9 shows an example (part 2) of the web UI screen of the user terminal according to the present embodiment. In the tag selection section 204 of the web UI screen, when the user selects, for example, the "school" tag category from among the multiple "tag categories" 205 such as "school," "building," "ruins," "season," "outdoors," and "nature," the "tags" 206 belonging to the "school" tag category are displayed. As shown in FIG. 9, the "tags" 206 belonging to the "school" tag category include, for example, "pool," "schoolyard," "corridor," "library," and "classroom" (see the tag table in the tag DB 109c shown in FIG. 6).

FIG. 10 shows an example (part 3) of the web UI screen of the user terminal according to the present embodiment. When the user selects, for example, the "pool" tag belonging to the "school" tag category, the selected tag is displayed in the tag selection field 207.

If an "option tag category" exists for the selected tag, the "option tag category" 208 and the "option tags" 209 corresponding to that tag are also displayed. For example, when the "pool" tag is selected, the "time of day" option tag category and the "morning" and "evening to night" option tags corresponding to the "pool" tag are displayed (see the option tag table in the tag DB 109c shown in FIG. 7). If the user does not select any "option tag" 209, the "default prompt element" (the prompt element corresponding to the "morning" option tag) is treated as selected.
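The tag lookup with a default option can be sketched as follows. This is only an illustration under assumptions: the dictionary layout, the key names (`time_of_day`, `evening_to_night`, `default`, `options`), and the function `user_prompt_for` are hypothetical, since the actual tag table and option tag table of FIGS. 6 and 7 are not reproduced in this section. Only the prompt element strings are taken from the description.

```python
# Hypothetical stand-in for the tag table: tag -> prompt element.
TAG_TABLE = {
    "pool": "pool,school,outdoors,((poolside))",
}

# Hypothetical stand-in for the option tag table:
# tag -> option tag category -> default option and per-option prompt elements.
OPTION_TAG_TABLE = {
    "pool": {
        "time_of_day": {
            "default": "morning",
            "options": {
                "morning": "sunlight,blue_sky",
                "evening_to_night": "night,night sky",
            },
        },
    },
}


def user_prompt_for(tag, selected_options=None):
    """Build the user prompt for one tag, falling back to each category's default."""
    selected_options = selected_options or {}
    parts = [TAG_TABLE[tag]]
    for category, spec in OPTION_TAG_TABLE.get(tag, {}).items():
        # If the user did not pick an option in this category, use the default.
        choice = selected_options.get(category, spec["default"])
        parts.append(spec["options"][choice])
    return ",".join(parts)


print(user_prompt_for("pool"))
print(user_prompt_for("pool", {"time_of_day": "evening_to_night"}))
```

With no option selected, the sketch yields the "pool" plus "morning" elements; selecting "evening to night" swaps only the time-of-day elements.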

FIG. 11 shows an example (part 4) of the web UI screen of the user terminal according to the present embodiment. When the user finishes selecting tags and presses the "background image generation" button 210, the generated illustration background image 203a is displayed in the generated image display section 203. The illustration background image 203a is generated by inputting the prompt created from the tags displayed in the tag selection field 207. An example of the prompt created from the selected "pool" (and "morning") tags is shown below. The prompt element for "pool" is "pool,school,outdoors,((poolside))". The default prompt element for "morning" is "sunlight,blue_sky".
Prompt (common prompt + user prompt): "(((masterpiece))),(((best quality))), (((high quality))), (((scenery))), (((no humans))), (((absurdres))),light_particles,finely detail,pool,school,outdoors,((poolside)),sunlight,blue_sky"
Negative prompt: "worst quality, low quality, medium quality, deleted, lowres, comic, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, jpeg artifacts, signature, watermark, username, blurry,cropped,ugly,outside_border"

Of this prompt, the "user prompt," which consists of the prompt elements corresponding to the selected "pool" tag (and the default "morning" option tag), is the portion "pool,school,outdoors,((poolside)),sunlight,blue_sky".

The created prompt may be shown explicitly to the user, for example on the web UI screen, or it may be hidden. Hiding it is acceptable because the user only needs to select tags and does not need to be aware of the created prompt itself.

FIG. 12 shows an example (part 5) of the web UI screen of the user terminal according to the present embodiment. FIG. 12 shows the illustration background image 203a generated when, on the web UI screen shown in FIG. 10, the user selects the "evening to night" option tag 209 and presses the "background image generation" button 210. Comparing the illustration background image 203a of FIG. 12 with that of FIG. 11, both depict the common "pool" situation, but the time of day differs: "morning" in one and "evening to night" in the other. An example of the prompt created from the selected "pool" and "evening to night" tags is shown below. The prompt element for "pool" is "pool,school,outdoors,((poolside))". The prompt element for "evening to night" is "night,night sky".

Prompt (common prompt + user prompt): "(((masterpiece))),(((best quality))), (((high quality))), (((scenery))), (((no humans))), (((absurdres))),light_particles,finely detail,pool,school,outdoors,((poolside)),night,night sky"
Negative prompt: "worst quality, low quality, medium quality, deleted, lowres, comic, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, jpeg artifacts, signature, watermark, username, blurry,cropped,ugly,outside_border"

Of this prompt, the "user prompt," which consists of the prompt elements corresponding to the selected "pool" and "evening to night" tags, is the portion "pool,school,outdoors,((poolside)),night,night sky".

When the user presses the "image save" button 211, the image shown in the generated image display section 203 is saved in the image storage DB 109b (FIG. 5(b)). The saved image is also displayed in the "Gallery" 202. Images in the "Gallery" 202 can also be downloaded to the user terminal 20.

FIG. 13 shows an example (part 6) of the web UI screen of the user terminal according to the present embodiment. FIG. 13 shows the illustration background image 203a generated when the user selects the "nature," "winter," and "morning" tags and presses the "background image generation" button 210. The generated illustration background image depicts the situation "nature," "winter," and "morning."

(Editing of illustration background image)
FIG. 14 shows an example (part 7) of the web UI screen of the user terminal according to the present embodiment. In the editing mode 221 for editing images, the web UI screen has the following controls for editing the image: "free movement" 222 for moving the illustration background image and the user character image, "character enlargement" 223 for enlarging or reducing the user character image, "background enlargement" 224 for enlarging or reducing the illustration background image, "background blur" 225 for blurring the illustration background image, and "saturation" 226 for raising or lowering the saturation of the illustration background image. The web UI screen also has "upload character image" 227 as a control for uploading a user character image.

FIG. 15 shows an example (part 8) of the web UI screen of the user terminal according to the present embodiment. As shown in FIG. 15(a), when the user operates "upload character image" 227, the user character image is uploaded and the user character image 203b is displayed superimposed on the illustration background image 203a in the generated image display section 203. The uploaded user character image is also saved in the image storage DB 109b (FIG. 5(a)). As shown in FIG. 15(b), when the user operates "character enlargement" 223, the user character image 203b on top of the illustration background image in the generated image display section 203 can be enlarged or reduced.

When the user operates "delete character" 228, the uploaded user character image 203b can be removed from the illustration background image in the generated image display section 203. When the user operates "save background" 229, only the illustration background image 203a in the generated image display section 203 is saved in the image storage DB 109b. In contrast, when the user operates the "image save" button 211, the entire image displayed in the generated image display section 203, that is, the composite image in which the user character image 203b is superimposed on the illustration background image 203a, is saved in the image storage DB 109b (FIG. 5(c)). The saved image is displayed in the "Gallery" 202.

FIG. 16 shows an example (part 9) of the web UI screen of the user terminal according to the present embodiment. As shown in FIG. 16(a), when the user operates "background enlargement" 224, the illustration background image 203a in the generated image display section 203 can be enlarged or reduced. As shown in FIG. 16(b), when the user operates "background blur" 225, the illustration background image 203a in the generated image display section 203 can be blurred.

FIG. 17 shows an example (part 10) of the web UI screen of the user terminal according to the present embodiment. As shown in FIG. 17, when the user operates "saturation" 226, the saturation of the illustration background image 203a in the generated image display section 203 can be raised or lowered.
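The description does not say how the saturation adjustment is computed. As one possible illustration only, the following Python sketch scales the saturation of a single RGB pixel in HSV space using the standard library `colorsys` module; the function name `adjust_saturation` and the multiplicative `factor` parameter are assumptions, not part of the embodiment.

```python
import colorsys


def adjust_saturation(rgb, factor):
    """Scale the saturation of one 8-bit RGB pixel by `factor` (1.0 = unchanged)."""
    r, g, b = (channel / 255.0 for channel in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    # Clamp so that the result stays a valid saturation value.
    s = max(0.0, min(1.0, s * factor))
    r2, g2, b2 = colorsys.hsv_to_rgb(h, s, v)
    return tuple(round(channel * 255) for channel in (r2, g2, b2))


pixel = (200, 100, 100)
print(adjust_saturation(pixel, 0.0))  # fully desaturated
print(adjust_saturation(pixel, 2.0))  # more saturated
```

Applying the same function to every pixel of the background image would raise or lower its overall saturation while leaving hue and brightness unchanged.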

FIG. 18 shows an example (part 11) of the web UI screen of the user terminal according to the present embodiment. As shown in FIG. 18(a), when the user operates "move character" 222a, the user character image 203b in the generated image display section 203 can be moved. As shown in FIG. 18(b), when the user operates "move background" 222b, the illustration background image 203a in the generated image display section 203 can be moved.

FIG. 19 shows an example (part 12) of the web UI screen of the user terminal according to the present embodiment. As shown in FIG. 19(a), when the user selects one illustration background image from the "Gallery" 202, the illustration background image 203a in the generated image display section 203 can be replaced with the selected illustration background image 203a.

FIG. 20 shows examples of illustration background images after editing according to the present embodiment. In the editing mode 221 for editing images (not shown), the web UI screen further allows the illustration background image 203a in the generated image display section 203 to be edited into, for example, a fisheye image with an angle close to the view a fish would have when looking up at the sky from underwater, a low-angle image in which the subject is shot from directly below, or an overhead image in which the subject is shot from directly above.

<Information processing>
FIG. 21 is a flowchart showing the image generation processing of the illustration background image generation server according to the present embodiment. The CPU 11 reads and executes a program that implements the flowchart shown in FIG. 21, whereby the steps (hereinafter written as "S") executed by each functional unit are carried out.

S1: Upon receiving an access and login request from the user terminal 20, the web UI data acquisition unit 101 acquires the web UI screen data prepared in advance for user terminals, and the web UI data transmission unit 102 transmits the acquired web UI screen data to the user terminal 20. Upon receiving the web UI screen data, the user terminal 20 displays a web UI screen based on that data on its screen via a predetermined application program (for example, a general-purpose web browser or a dedicated application program).

S2: The tag receiving unit 103 receives, from the user terminal 20, the tags selected by the user via the web UI screen. The tags are received, for example, when the user finishes selecting tags on the web UI screen and presses the "background image generation" button 210.

S3: The prompt creation unit 104 refers to the tag table and the option tag table in the tag DB 109c and creates a prompt based on the tags acquired in S2.

Specifically, the prompt creation unit 104 first refers to the tag table and the option tag table in the tag DB 109c and acquires the prompt elements corresponding to the tags acquired in S2. If the tag selected by the user is the "pool" tag, the prompt element "pool,school,outdoors,((poolside))" corresponding to the "pool" tag is acquired. If nothing is selected in the "time of day" option tag category, the default prompt element "sunlight,blue_sky" corresponding to the "morning" tag is acquired. A "user prompt" (A2) consisting of the prompt elements corresponding to the tags acquired in S2 is then created:
pool,school,outdoors,((poolside)),sunlight,blue_sky

Next, the prompt creation unit 104 acquires the predetermined "common prompt" (A1) held in advance in a predetermined memory area (not shown) of the storage unit 109:
(((masterpiece))),(((best quality))), (((high quality))), (((scenery))), (((no humans))), (((absurdres))),light_particles,finely detail

The prompt creation unit 104 then creates a "prompt" (A1+A2) that includes the acquired "common prompt" (A1) and the "user prompt" (A2).

The prompt creation unit 104 also acquires the predetermined "negative prompt" (B) held in advance in a predetermined memory area (not shown) of the storage unit 109. In this way, the "prompt" (A1+A2) and the "negative prompt" (B) are created.
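The assembly performed in S3 can be sketched in Python as follows. This is a minimal illustration, not the server's actual implementation: the names `PromptPair` and `build_prompts` are hypothetical, and joining A1 and A2 with a single comma is an assumption inferred from the example prompts quoted in the description. The constant strings are the ones given in the description.

```python
from dataclasses import dataclass

# (A1) common prompt and (B) negative prompt, as quoted in the description.
COMMON_PROMPT = (
    "(((masterpiece))),(((best quality))), (((high quality))), (((scenery))), "
    "(((no humans))), (((absurdres))),light_particles,finely detail"
)
NEGATIVE_PROMPT = (
    "worst quality, low quality, medium quality, deleted, lowres, comic, "
    "bad anatomy, bad hands, text, error, missing fingers, extra digit, "
    "fewer digits, cropped, jpeg artifacts, signature, watermark, username, "
    "blurry,cropped,ugly,outside_border"
)


@dataclass(frozen=True)
class PromptPair:
    """The two inputs handed to the image generator in S4."""
    prompt: str           # A1 + A2
    negative_prompt: str  # B


def build_prompts(user_prompt: str) -> PromptPair:
    """Combine the fixed common prompt (A1) with the user prompt (A2), and attach B."""
    return PromptPair(
        prompt=f"{COMMON_PROMPT},{user_prompt}",
        negative_prompt=NEGATIVE_PROMPT,
    )


pair = build_prompts("pool,school,outdoors,((poolside)),sunlight,blue_sky")
print(pair.prompt)
print(pair.negative_prompt)
```

Keeping A1 and B as fixed constants reflects the point that only A2 varies with the user's tag selection.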

S4: The prompt creation unit 104 inputs the prompt created in S3 to the AI image generation unit 105. Specifically, the prompt creation unit 104 inputs the above "prompt" (A1+A2) and "negative prompt" (B) to the AI image generation unit 105.

S5: The AI image generation unit 105 generates an illustration background image based on the prompt input in S4.

S6: The image output unit 106 outputs the illustration background image generated in S5. The storage unit 109 saves the illustration background image in the image storage DB (illustration background image) 109b in association with the "user ID" of the user who accessed and logged in. In the image storage DB (illustration background image) 109b, the tags received in S2 and the prompt input in S4 are saved in the "tag" and "prompt" items, respectively (see FIG. 5(b)).

Because the image storage DB (illustration background image) 109b shown in FIG. 5(b) stores the "tag" and "prompt (sentence)" items, the user can read out the "image data," "tags," and "prompt" as a set on the web UI screen as image generation history information. By selecting new tags or option tags while referring to a previously generated set of "image data," "tags," and "prompt," the user can try generating illustration background images with various tag patterns, comparing each newly generated image against those generated in the past. In addition, by referring to the "image data" and the "prompt," the user can learn prompt-writing skills or directly edit the text of a previously created "prompt."
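The history lookup described above can be sketched with a small in-memory structure. This is an illustration under assumptions: the class names `BackgroundImageRecord` and `ImageStore` and their field names are hypothetical stand-ins for the items of the image storage DB 109b ("user ID," image data, "tag," "prompt"); the actual table layout of FIG. 5(b) is not reproduced in this section.

```python
from dataclasses import dataclass, field


@dataclass
class BackgroundImageRecord:
    """One saved illustration background image with its generation history."""
    user_id: str
    image_data: bytes
    tags: list
    prompt: str


@dataclass
class ImageStore:
    """In-memory stand-in for the image storage DB (illustration background image)."""
    records: list = field(default_factory=list)

    def save(self, record: BackgroundImageRecord) -> None:
        self.records.append(record)

    def history(self, user_id: str) -> list:
        """Return the (image data, tags, prompt) sets saved for one user."""
        return [r for r in self.records if r.user_id == user_id]


store = ImageStore()
store.save(BackgroundImageRecord(
    user_id="user-001",
    image_data=b"<image bytes>",
    tags=["pool", "morning"],
    prompt="pool,school,outdoors,((poolside)),sunlight,blue_sky",
))
for rec in store.history("user-001"):
    print(rec.tags, rec.prompt)
```

Storing the tags and prompt alongside the image is what lets a user revisit an earlier result and vary the tag selection from there.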

In S4, when the prompt creation unit 104 inputs the created prompt to the AI image generation unit 105, a random value called a seed is also given to the AI image generation unit 105 as an input along with the prompt. The seed is used to give randomness to the generated image. Because the seed is a random number with a very large number of digits that changes every time, it is extremely unlikely that exactly the same illustration background image will be generated in S5 even if the input prompt is the same every time (conversely, if a fixed seed and a fixed prompt are input, the randomness disappears and the same illustration background image is generated every time). Therefore, even if the user selects the same "tag" 206 or the same "option tag" 209, a different illustration background image can be generated and output. The user can thus generate multiple illustration background images while trying the same "tags" or different combinations of "tags," and pick the desired illustration background images from the "Gallery" 202.
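The effect of the seed can be demonstrated with a toy stand-in for the image generator. In the sketch below, `fake_generate` is a hypothetical placeholder that returns a few pseudo-random "pixel" values instead of a real image, and `generate` draws a fresh 64-bit seed when none is supplied; neither reflects the actual AI image generation unit 105, whose internals are not described here.

```python
import random
import secrets
from typing import Optional


def fake_generate(prompt: str, seed: int, size: int = 8) -> list:
    """Toy generator: the output depends only on the prompt and the seed."""
    rng = random.Random(f"{seed}|{prompt}")
    return [rng.randrange(256) for _ in range(size)]


def generate(prompt: str, seed: Optional[int] = None) -> tuple:
    """Generate with a fixed seed if given, otherwise with a fresh random seed."""
    if seed is None:
        seed = secrets.randbits(64)  # new seed on every call
    return seed, fake_generate(prompt, seed)


prompt = "pool,school,outdoors,((poolside)),sunlight,blue_sky"

# Fixed seed and fixed prompt: the same output every time.
_, first = generate(prompt, seed=1234)
_, second = generate(prompt, seed=1234)
print(first == second)

# No seed given: each call draws its own seed, so outputs normally differ.
seed_a, out_a = generate(prompt)
seed_b, out_b = generate(prompt)
print(seed_a, seed_b)
```

Recording the seed together with the prompt would be one way to make a given result reproducible, although the description does not state that the seed is stored.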

("Common prompt" and "negative prompt")
As described above, the "common prompt" and the "negative prompt" consist of predetermined prompt elements defined in advance and are fixed prompts whose instruction content is set beforehand. That is, in this embodiment, two things are created: a "prompt" that includes the "common prompt" consisting of predetermined prompt elements and the "user prompt" consisting of the prompt elements corresponding to the tags selected by the user, and a "negative prompt" consisting of predetermined prompt elements.

The "common prompt" and the "negative prompt" are combinations of prompt elements, found through the applicant's research, that raise the overall image quality of an illustration background. Because the prompt creation unit 104 according to the present embodiment automatically incorporates the "common prompt" and the "negative prompt" into the prompts it creates, the user does not need to think about tags or prompt elements related to image quality. Simply by selecting tags that fit the desired situation on the web UI screen, the user obtains a high-quality image (for example, an illustration background image) as a result.

FIG. 22 shows examples of illustration background images generated with and without the "common prompt" and the "negative prompt" according to the present embodiment. FIG. 22(a) is an illustration background image generated when the user selects the "grassland" tag, with the "common prompt" and the "negative prompt" included in the input. FIG. 22(b) is an illustration background image generated when the user selects the same "grassland" tag, but without the "common prompt" and the "negative prompt." Image (a) is a background with depth, whereas (b) is a flat background in which the scenery is so close that it is hard to tell what it depicts. A background image with depth usually gives a composition well suited to placing a character. A flat background image differs little from simply using a plain single-color background or something close to it. One purpose of the "common prompt" and the "negative prompt" is therefore to include, in advance, prompt elements that cause a background with such depth to be generated.

FIG. 22(c) is an illustration background image generated when the user selects the "schoolyard" tag, with the "common prompt" and the "negative prompt" included in the input. FIG. 22(d) is an illustration background image generated when the user selects the same "schoolyard" tag, but without the "common prompt" and the "negative prompt." Image (c) is a background with no human figures, whereas in (d) human figures appear in the background. Another purpose of the "common prompt" and the "negative prompt" according to the present embodiment is to include, in advance, prompt elements that instruct the AI to draw a background. Without them, the AI has received no instruction to draw a background and may draw human figures that are unsuitable for an illustration background image.

<Summary>
As described above, with the illustration background image generation system 100 according to the present embodiment, the user can easily create a prompt by selecting, on the web UI screen, tags that indicate the situation of the background image. That is, the user can easily create a prompt sentence simply by selecting the situation tags of a background image that matches, for example, the overall composition with the user character, the perspective, or the story.

The user can also generate an illustration background image by inputting the created prompt sentence. By placing a user character image on an illustration background image of the situation the user has in mind, the user can complete his or her own character illustration (composite content image). Furthermore, because less effort is spent writing prompt sentences, users (illustrators and the like) can devote more of their skill and time to pursuing the originality and quality of their characters.

That is, according to the present embodiment, in one aspect, it is possible to support the user's prompt input when an illustration background image is generated by AI, and in turn to support the user in creating the illustration background image and completing his or her own character illustration (composite content image).

Although the present invention has been described above with specific examples according to preferred embodiments, it is clear that various modifications and changes can be made to these examples without departing from the broad spirit and scope of the invention as defined in the claims. In other words, the invention should not be construed as limited by the details of the specific examples or by the accompanying drawings.

The functions of the image generation server 10 according to the present embodiment can also be built into an application program installed on the user terminal 20. In that case, images can be generated on the user terminal 20 alone, without communicating with the image generation server 10 over the network 50. For example, one aspect of the application program on the user terminal 20 is an application program that causes a computer to function as:
display means for displaying a UI including a plurality of tags on a display device (for example, the UI screen in FIG. 8);
tag selection means for allowing a user to select one or more of the tags (for example, the tag selection unit 204 in FIG. 8);
prompt creation means for creating a prompt including an element corresponding to the selected tag (corresponding to the function of the prompt creation unit 104 of the image generation server 10);
image generation means for generating an image based on the prompt (corresponding to the function of the AI image generation unit 105 of the image generation server 10); and
image output means for outputting the image (corresponding to the function of the image output unit 106 of the image generation server 10).
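The five means listed above can be pictured as a single on-device pipeline. The following is a minimal sketch under stated assumptions, not the implementation described in this publication: the tag table, the function names, and the `fake_generator` stand-in for an image model are all hypothetical, and a real application would substitute its own UI toolkit and image generation model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

# Hypothetical tag table (UI label -> prompt element); not taken from the patent.
TAG_ELEMENTS: Dict[str, str] = {
    "classroom": "school classroom",
    "sunset": "sunset sky",
    "rain": "rainy weather",
}


@dataclass
class GeneratedImage:
    prompt: str
    data: bytes


def display_ui(tags: Sequence[str]) -> List[str]:
    """Display means: return the lines a tag-selection UI would show."""
    return [f"[{i}] {tag}" for i, tag in enumerate(tags)]


def select_tags(tags: Sequence[str], chosen: Sequence[int]) -> List[str]:
    """Tag selection means: map the user's chosen indices to tags."""
    if not chosen:
        raise ValueError("at least one tag must be selected")
    return [tags[i] for i in chosen]


def create_prompt(selected: Sequence[str]) -> str:
    """Prompt creation means: join the element mapped to each selected tag."""
    return ", ".join(TAG_ELEMENTS[tag] for tag in selected)


def fake_generator(prompt: str) -> GeneratedImage:
    """Stand-in for an on-device image model (assumption): placeholder bytes."""
    return GeneratedImage(prompt=prompt, data=prompt.encode("utf-8"))


def run_app(
    chosen: Sequence[int],
    generate: Callable[[str], GeneratedImage],
    output: Callable[[GeneratedImage], None],
) -> GeneratedImage:
    """Run display -> selection -> prompt creation -> generation -> output."""
    tags = list(TAG_ELEMENTS)
    display_ui(tags)                      # display means
    selected = select_tags(tags, chosen)  # tag selection means
    prompt = create_prompt(selected)      # prompt creation means
    image = generate(prompt)              # image generation means
    output(image)                         # image output means
    return image
```

Passing the generator and the output sink as callables keeps the sketch independent of any particular model or display; no network call to the image generation server 10 appears anywhere in the flow.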

10 Illustration background image generation server
20 User terminal
50 Network
100 Illustration background image generation system
101 Web UI data acquisition unit
102 Web UI data transmission unit
103 Tag reception unit
104 Prompt creation unit
105 AI image generation unit
106 Image output unit
107 Character image acquisition unit
108 Image editing unit
109 Storage unit

Claims

An image generation device comprising:
UI data transmission means for transmitting, to a user terminal, UI data for generating a UI including a plurality of tags;
tag receiving means for receiving, from the user terminal, one or more of the tags selected by a user;
prompt creation means for creating a prompt including an element corresponding to the tag;
image generation means for generating an image based on the prompt; and
image output means for outputting the image,
wherein the prompt creation means creates:
a prompt comprising a predefined first prompt element and an element corresponding to the tag; and
a negative prompt consisting of predefined second prompt elements.
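As a non-authoritative illustration of the prompt creation means recited above, the sketch below builds a prompt from fixed first prompt elements plus the elements mapped to the selected tags, and a negative prompt from fixed second prompt elements only. Every concrete value (the element strings, the tag table, and the names `create_prompts` and `PromptPair`) is an assumption made for illustration and does not come from this publication.

```python
from typing import Dict, List, NamedTuple, Sequence

# Illustrative assumptions only; the publication does not list these values.
FIRST_PROMPT_ELEMENTS: List[str] = ["masterpiece", "scenery", "no humans"]
SECOND_PROMPT_ELEMENTS: List[str] = ["low quality", "blurry", "watermark"]
TAG_TO_ELEMENT: Dict[str, str] = {
    "classroom": "school classroom",
    "night": "night sky",
    "cafe": "cafe interior",
}


class PromptPair(NamedTuple):
    prompt: str
    negative_prompt: str


def create_prompts(selected_tags: Sequence[str]) -> PromptPair:
    """Build the prompt and the negative prompt for the selected tags."""
    if not selected_tags:
        raise ValueError("at least one tag must be selected")
    unknown = [tag for tag in selected_tags if tag not in TAG_TO_ELEMENT]
    if unknown:
        raise KeyError(f"unknown tags: {unknown}")
    # Prompt: predefined first elements followed by the tag-derived elements.
    tag_elements = [TAG_TO_ELEMENT[tag] for tag in selected_tags]
    prompt = ", ".join(FIRST_PROMPT_ELEMENTS + tag_elements)
    # Negative prompt: predefined second elements only; tags do not affect it.
    negative_prompt = ", ".join(SECOND_PROMPT_ELEMENTS)
    return PromptPair(prompt, negative_prompt)
```

In this sketch the user's tag choice changes only the tail of the prompt, while the quality-related elements on both sides stay fixed, so the user never has to type them.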

The image generation device according to claim 1,
wherein the image is a background image,
the tag is a tag indicating a situation of the background image, and
the image generation device further comprises:
character image acquisition means for acquiring a character image from the user terminal; and
image editing means for arranging the character image on the background image.
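One way to picture the image editing means is as alpha compositing of the character image over the background image at a chosen position. The sketch below is a simplified assumption, not the method of this publication: images are plain nested lists of RGBA tuples, the background is assumed fully opaque, and the name `place_character` is hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int, int]  # (R, G, B, A), each 0-255
Image = List[List[Pixel]]          # rows of pixels


def place_character(background: Image, character: Image,
                    left: int, top: int) -> Image:
    """Return a copy of `background` with `character` composited at (left, top).

    Assumes the background is fully opaque, so its alpha is kept unchanged.
    Character pixels that fall outside the background are clipped.
    """
    result = [row[:] for row in background]  # do not mutate the input
    height = len(result)
    width = len(result[0]) if result else 0
    for cy, row in enumerate(character):
        for cx, (r, g, b, a) in enumerate(row):
            x, y = left + cx, top + cy
            if not (0 <= x < width and 0 <= y < height):
                continue  # clipped
            br, bg, bb, ba = result[y][x]
            # Standard "over" blend with integer rounding.
            result[y][x] = (
                (r * a + br * (255 - a) + 127) // 255,
                (g * a + bg * (255 - a) + 127) // 255,
                (b * a + bb * (255 - a) + 127) // 255,
                ba,
            )
    return result
```

Transparent character pixels (alpha 0) leave the background untouched, which is what lets a cut-out character sit on top of a generated background.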

The image generation device according to claim 1, further comprising:
storage means for storing the image in association with the tag and the prompt,
wherein the image output means outputs the image together with the tag and the prompt associated with the image.
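The association between an image, its tags, and its prompt can be sketched as a small in-memory store. This is an illustrative assumption only: the class and method names are hypothetical, and an actual storage means would more likely be backed by a database or file storage.

```python
from dataclasses import dataclass
from typing import Dict, Sequence, Tuple


@dataclass(frozen=True)
class StoredImage:
    image: bytes
    tags: Tuple[str, ...]
    prompt: str


class ImageStore:
    """In-memory stand-in for a storage means (assumption)."""

    def __init__(self) -> None:
        self._records: Dict[int, StoredImage] = {}
        self._next_id = 1

    def save(self, image: bytes, tags: Sequence[str], prompt: str) -> int:
        """Store the image together with its tags and prompt; return its ID."""
        image_id = self._next_id
        self._records[image_id] = StoredImage(image, tuple(tags), prompt)
        self._next_id += 1
        return image_id

    def output(self, image_id: int) -> StoredImage:
        """Return the image with the tags and prompt associated with it."""
        return self._records[image_id]
```

Returning the tags and prompt alongside the image lets a user see which selections produced a given result and reuse them.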

A program for causing a computer to function as:
UI data transmission means for transmitting, to a user terminal, UI data for generating a UI including a plurality of tags;
tag receiving means for receiving, from the user terminal, one or more of the tags selected by a user;
prompt creation means for creating a prompt including an element corresponding to the tag;
image generation means for generating an image based on the prompt; and
image output means for outputting the image,
wherein the prompt creation means creates:
a prompt comprising a predefined first prompt element and an element corresponding to the tag; and
a negative prompt consisting of predefined second prompt elements.

A prompt creation support device comprising:
UI data transmission means for transmitting, to a user terminal, UI data for generating a UI including a plurality of tags;
tag acquisition means for acquiring, from the user terminal, one or more of the tags selected by a user; and
prompt creation means for creating a prompt corresponding to the tag,
wherein the prompt creation means creates:
a prompt comprising a predefined first prompt element and an element corresponding to the tag; and
a negative prompt consisting of predefined second prompt elements.

A program for causing a computer to function as:
UI data transmission means for transmitting, to a user terminal, UI data for generating a UI including a plurality of tags;
tag acquisition means for acquiring, from the user terminal, one or more of the tags selected by a user; and
prompt creation means for creating a prompt corresponding to the tag,
wherein the prompt creation means creates:
a prompt comprising a predefined first prompt element and an element corresponding to the tag; and
a negative prompt consisting of predefined second prompt elements.

An application program for causing a computer to function as:
display means for displaying a UI including a plurality of tags on a display device;
tag selection means for allowing a user to select one or more of the tags;
prompt creation means for creating a prompt including an element corresponding to the selected tag;
image generation means for generating an image based on the prompt; and
image output means for outputting the image,
wherein the prompt creation means creates:
a prompt comprising a predefined first prompt element and an element corresponding to the tag; and
a negative prompt consisting of predefined second prompt elements.