WO2025032985A1 - Generation support device, generation support program, and generation support method - Google Patents

Generation support device, generation support program, and generation support method

Info

Publication number
WO2025032985A1
Authority
WO
WIPO (PCT)
Prior art keywords
generation
information
image
style
acquisition unit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/JP2024/022627
Other languages
English (en)
French (fr)
Japanese (ja)
Inventor
麟太郎 鈴木
イブラヒマ カン
サリュー カン
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fotographer Ai
Fotographer Ai Inc
Original Assignee
Fotographer Ai
Fotographer Ai Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fotographer Ai, Fotographer Ai Inc
Publication of WO2025032985A1
Priority to US19/264,887 (US20250336105A1)
Anticipated expiration
Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/00 Two-dimensional [2D] image generation
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/00 Two-dimensional [2D] image generation
    • G06T11/60 Creating or editing images; Combining images with text

Definitions

  • The present invention relates to a generation support device, a generation support program, and a generation support method.
  • Patent Document 1 proposes a technology that uses machine learning to generate character images.
  • However, the technology of Patent Document 1 can only generate images of characters in arbitrary poses, and cannot be applied to generating a variety of images.
  • The present invention was made in light of this background, and aims to enable users to easily generate the images they desire.
  • The generation support device disclosed herein comprises: a generation information acquisition unit that acquires generation information from a user, the generation information including at least style information related to the style of a result image and an element image that constitutes part of the result image; and a result image generation unit that inputs text information generated based on the generation information into a generation model and generates the result image based on output information output from the generation model.
  • The present invention allows users to easily generate the image they desire.
  • FIG. 1 is a diagram showing an example of the overall configuration of a generation support system according to an embodiment of the present invention.
  • FIG. 2 is a diagram illustrating an example of a hardware configuration of a server device 1 according to the embodiment.
  • FIG. 3 is a diagram illustrating an example of a functional configuration of the server device 1 according to the embodiment.
  • FIG. 4 is a diagram showing an example of basic information stored in a generation information storage unit 131.
  • FIG. 5 is an example of a screen on which the generation information acquisition unit 111 acquires position information of a partial image and a material image.
  • FIG. 6 is a diagram illustrating an example of a process of the server device 1 according to the embodiment.
  • [Item 1] A generation support device that supports generation of a result image, comprising: a generation information acquisition unit that acquires, from a user, generation information including at least style information related to a style of the result image and an element image that constitutes a part of the result image; and a result image generation unit that inputs text information generated based on the generation information into a generation model and generates the result image based on output information output from the generation model.
  • [Item 2] The generation support device, wherein the generation information acquisition unit further acquires position information of the element image in the result image.
  • [Item 3] The generation support device according to item 1 or 2, wherein the style information includes information of a web address, and the generation information acquisition unit determines the style information based on web information included in the website specified by the web address.
  • [Item 4] The generation support device, wherein the generation information acquisition unit accepts upload or selection of the element image, and acquires the position information based on an arrangement of the element image in a frame of the result image.
  • [Item 5] The generation support device, wherein the generation information acquisition unit acquires the generation information through input by the user in a chat format.
  • [Item 6] The generation support device according to item 5, wherein, when acquiring the input in the chat format, the generation information acquisition unit presents to the user a suggestion of information required as the generation information.
  • [Item 7] A generation support program for supporting generation of a result image, the program causing a processor to execute: a generation information acquisition step of acquiring, from a user, generation information including at least style information related to a style of the result image and an element image constituting a part of the result image; and an image generation step of inputting text information generated based on the generation information into a generation model, and generating the result image based on output information output from the generation model.
  • [Item 8] A generation support method for supporting generation of a result image, in which a processor executes: a generation information acquisition step of acquiring, from a user, generation information including at least style information related to a style of the result image and an element image constituting a part of the result image; and an image generation step of inputting text information generated based on the generation information into a generation model, and generating the result image based on output information output from the generation model.
  • FIG. 1 is a diagram showing an example of the overall configuration of a generation support system according to one embodiment of the present invention.
  • the generation support system of this embodiment is configured to include a server device 1.
  • the server device 1 is communicably connected to a user terminal 3 via a communication network 2.
  • the communication network 2 is, for example, the Internet, and is constructed using a public telephone line network, a mobile phone line network, a wireless communication path, Ethernet (registered trademark), or the like.
  • the server device 1 may be, for example, a general-purpose computer such as a workstation or a personal computer, or may be logically realized by cloud computing. In this embodiment, one server device is illustrated for convenience of explanation, but the present invention is not limited to this and multiple servers may be used.
  • the user terminal 3 is a computer operated by a user who generates an image.
  • the user terminal 3 is a smartphone, a tablet computer, a personal computer, etc.
  • the user can access the server device 1 by, for example, an application or a web browser executed on the user terminal 3.
  • the server device 1 includes a processor 101, a memory 102, a storage device 103, a communication interface 104, an input device 105, and an output device 106.
  • the storage device 103 is, for example, a hard disk drive, a solid state drive, or a flash memory that stores various data and programs.
  • the communication interface 104 is an interface for connecting to the communication network 2, and is, for example, an adapter for connecting to Ethernet (registered trademark), a modem for connecting to a public telephone line network, a wireless communication device for wireless communication, a USB (Universal Serial Bus) connector or an RS232C connector for serial communication, etc.
  • the input device 105 is, for example, a keyboard, a mouse, a touch panel, a button, a microphone, etc. that input data.
  • the output device 106 is, for example, a display, a printer, a speaker, etc. that output data.
  • Each functional unit of the server device 1, which will be described later, is realized by the processor 101 reading a program stored in the storage device 103 into the memory 102 and executing it, and each storage unit of the server device 1 is realized as part of the storage area provided by the memory 102 and the storage device 103.
  • FIG. 3 shows the functional configuration of the server device 1.
  • The server device 1 includes two storage units, a generation information storage unit 131 and a result image information storage unit 132, and two functional units, a generation information acquisition unit 111 and a result image generation unit 112.
  • the generation information storage unit 131 stores information (hereinafter, referred to as generation information) used to generate a result image (an image generated by the server device 1), as shown in FIG. 4 as an example.
  • the generation information may include, for example, style-related information (including text information, web addresses, web information, etc.).
  • the generation information may also include element images that are the basis for the configuration of a part of the result image.
  • The element images are, for example, partial images that serve as the subject of the result image. The subject is, for example, a person or an object, and is described as the main content of the image when the generation information acquisition unit 111 described below generates the text to be input to the generation model. When the subject of a partial image is an object, the partial image may be, for example, an image of a product, an image containing a product, or an image of the appearance of a product container, outer box, or the like.
  • the element images may also include material images that are not the subject of the result image.
  • the generation information may include, for example, information on the position of the partial image and material image in the result image, but is not limited to these.
  • The style is the design style of the result image generated by the generation support device. It represents, for example, aesthetic attributes and characteristics such as the requirements of the result image (objects, people, landscapes, and other elements contained in the image), concept (target, narrative, etc.), color, texture (the visually perceptible texture of the image surface), layout (the arrangement and relative positions of elements), font, and shape (sharp angles, rounded shapes, straight lines, curves, etc.).
  • the material images are images that serve as the basis for part of the resulting image, such as images of hands, faces, plants, everyday items, stands, geometric figures such as circles, triangles, and squares, or free-form figures that are not bound by geometric rules.
  • the material images may also include images (templates) that serve as the basis for the background of the image to be generated, but are not limited to these.
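  • As an illustration only, the generation information described above could be held in a record like the following. This is a minimal sketch: the class names, field names, file names, and the example URL are hypothetical and do not come from the source.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class ElementImage:
    """One element image; `kind` is "partial" (subject) or "material"."""
    path: str
    kind: str
    # Center of the element in result-image coordinates (origin at a corner).
    center_xy: Optional[Tuple[int, int]] = None


@dataclass
class GenerationInfo:
    """Hypothetical record for the generation information storage unit 131."""
    style_text: List[str] = field(default_factory=list)
    web_address: Optional[str] = None
    element_images: List[ElementImage] = field(default_factory=list)

    def subjects(self) -> List[ElementImage]:
        """Return only the partial images, i.e. the subjects of the result image."""
        return [e for e in self.element_images if e.kind == "partial"]


info = GenerationInfo(
    style_text=["natural", "luxurious"],
    web_address="https://example.com",
    element_images=[
        ElementImage("bottle.png", "partial", (320, 240)),
        ElementImage("leaf.png", "material", (100, 400)),
    ],
)
```

Keeping partial images and material images in one list with a `kind` field mirrors the text, which treats both as element images and distinguishes them only by whether they are the subject.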
  • the result image information storage unit 132 stores the result image generated by the result image generation unit 112.
  • The generation information acquisition unit 111 acquires, from the user terminal 3 via the communication network 2, the generation information needed to generate a result image. This includes style information related to the style of the result image, the element images constituting the result image, and position information of the element images in the result image.
  • the generation information acquisition unit 111 stores the acquired generation information in the generation information storage unit 131.
  • the communication in the transmission and reception may be either wired or wireless, and any communication protocol may be used as long as communication between the two parties is possible.
  • the generation information acquisition unit 111 may acquire the generation information as text information.
  • the generation information acquisition unit 111 may acquire a sentence indicating the result image to be generated through an input operation by the user, or may acquire one or more words.
  • the generation information acquisition unit 111 may also present a sentence or word indicating the style of the result image to be generated to the user, and acquire the sentence or word selected by the user as the generation information.
  • the generation information acquisition unit 111 may acquire information on a web address (URL, etc.) as the generation information.
  • the generation information acquisition unit 111 may acquire web information contained in a website specified by the web address, determine the style, and set it as the generation information.
  • The web information may be text information, image information, video information, or code information contained in the website. The code information is the code constituting the website, which may be in a format such as HTML, CSS, or JavaScript, but is not limited to these.
  • the generation information acquisition unit 111 may determine the style of the target, concept, etc., based on the text information, for example.
  • the generation information acquisition unit 111 may perform morphological analysis on the text information, and determine the style of the target, concept, etc., based on information on the words contained and the number of words, but is not limited to these methods.
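  • The word-count approach above can be sketched as follows. This is an assumption-laden illustration: the keyword lists are hypothetical, and a simple regular-expression tokenizer stands in for morphological analysis, which for Japanese text would require a dedicated analyzer.

```python
import re
from collections import Counter

# Hypothetical keyword lists; a real system would define its own vocabulary.
STYLE_KEYWORDS = {
    "luxurious": {"premium", "elegant", "exclusive", "gold"},
    "natural": {"organic", "botanical", "fresh", "green"},
}


def tokenize(text: str) -> list:
    """Stand-in for morphological analysis: lowercase alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def determine_style(text: str) -> str:
    """Return the style whose keywords occur most often, or "unknown"."""
    counts = Counter(tokenize(text))
    scores = {
        style: sum(counts[word] for word in words)
        for style, words in STYLE_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"
```

For example, `determine_style("Organic, fresh botanical care with a premium feel")` counts three "natural" keywords against one "luxurious" keyword and returns `"natural"`.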
  • the generation information acquisition unit 111 may determine the style of the color, texture, font, shape, etc., from the image information, video information, or code information.
  • The generation information acquisition unit 111 may analyze the image information or video information and determine the style from the predominant colors, textures, fonts, shapes, etc., or may determine the style based on information contained in the code information, such as the colors, textures, shapes, and fonts used on the website or in its background image, but is not limited to these methods.
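  • One way to read style hints from code information is to scan CSS for colors and fonts. The sketch below assumes six-digit hex colors and `font-family` declarations only; the function name and the sample CSS are hypothetical, and the source does not prescribe this method.

```python
import re
from collections import Counter

HEX_COLOR = re.compile(r"#[0-9a-fA-F]{6}\b")
FONT_FAMILY = re.compile(r"font-family\s*:\s*([^;}]+)")


def style_from_css(css: str) -> dict:
    """Return the most frequent hex color and the first declared font family."""
    colors = Counter(c.lower() for c in HEX_COLOR.findall(css))
    fonts = FONT_FAMILY.findall(css)
    return {
        "dominant_color": colors.most_common(1)[0][0] if colors else None,
        # Keep only the first family in the list and drop surrounding quotes.
        "font": fonts[0].split(",")[0].strip().strip("'\"") if fonts else None,
    }


css = """
body { background: #FAF7F2; color: #333333; font-family: 'Noto Serif', serif; }
h1 { color: #333333; }
"""
```

With the sample above, `style_from_css(css)` reports `#333333` as the dominant color and `Noto Serif` as the font.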
  • the generation information acquisition unit 111 acquires element images that are the basis for forming part of the resultant image.
  • the generation information acquisition unit 111 may accept uploads of element images.
  • the generation information acquisition unit 111 may store a material image (e.g., 201 in FIG. 5) in the server device 1, present the material image on the user terminal 3, accept a selection operation of the material image by the user, and acquire the material image selected by the user as generation information.
  • the generation information acquisition unit 111 acquires position information of the element image in the result image.
  • the position information indicates the coordinates of the element image in the result image, and may be, for example, XY coordinates with a specific position such as a specific corner of the result image as the origin, and may be the XY coordinates of the center of the element image.
  • The generation information acquisition unit 111 presents, on the user terminal 3, a frame (for example, 202) corresponding to the shape of the result image, as shown in FIG. 5 as an example. The frame may be horizontal, square, vertical, or another shape, but is not limited to these.
  • the generation information acquisition unit 111 acquires information on the user's operation on the user terminal 3 in the frame, and acquires position information of the material image in the result image.
  • the generation information acquisition unit 111 may, for example, accept a drag-and-drop operation by the user, acquire arrangement information of the partial image (203) or the material image (204), and acquire position information of the element image.
  • the generation information acquisition unit 111 may also accept enlargement, reduction, rotation, inversion, deformation, and the like of the element image.
  • the generation information acquisition unit 111 may acquire the anteroposterior relationship between multiple element images (which may be layer information, for example) as position information.
  • the position information may be information about the positional relationship between element images (such as that a material image is located below a partial image).
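  • The position information described above (center coordinates relative to a corner of the result image, plus front-to-back order) can be derived from where the user drops each element in the frame. The following is a minimal sketch; the `Placement` fields and the convention that larger `layer` values are drawn in front are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Placement:
    """Where the user dropped an element inside the frame (top-left origin)."""
    name: str
    left: float
    top: float
    width: float
    height: float
    layer: int  # assumption: larger values are drawn in front


def center_xy(p: Placement) -> tuple:
    """XY coordinates of the element's center, with the frame corner as origin."""
    return (p.left + p.width / 2, p.top + p.height / 2)


def position_info(placements: list) -> list:
    """Return elements ordered from back to front with their center coordinates."""
    ordered = sorted(placements, key=lambda p: p.layer)
    return [{"name": p.name, "center": center_xy(p), "layer": p.layer} for p in ordered]


placements = [
    Placement("product", left=200, top=100, width=240, height=320, layer=1),
    Placement("stand", left=150, top=380, width=340, height=80, layer=0),
]
```

Here `position_info(placements)` lists the stand first (it sits behind the product) and gives the product a center of (320, 260).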
  • the generation information acquisition unit 111 may acquire the generation information in a chat format.
  • The generation information acquisition unit 111 may divide the text information acquired in the chat format into words by morphological analysis or the like, and acquire the word information as the generation information.
  • The generation information acquisition unit 111 may help the user recognize which generation information is needed by presenting guidance such as "Please upload a subject image" or "Please tell us a reference website".
  • For example, the generation information acquisition unit 111 may compare the acquired information against a list, prepared in advance, of the generation information required for image generation, and provide guidance about any item that has not been acquired from the user or that could not be determined from the acquired generation information. The method is not limited to this.
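  • Checking acquired information against a list prepared in advance can be sketched as below. The required items and the wording of the third message are hypothetical; only the first two guidance texts echo the examples in the description.

```python
# Hypothetical list of required items and the guidance shown when one is missing.
REQUIRED_ITEMS = {
    "element_image": "Please upload a subject image.",
    "style": "Please tell us a reference website.",
    "position": "Please place the image inside the frame.",
}


def next_guidance(acquired: dict) -> list:
    """Return guidance messages for required items that are absent or empty."""
    return [msg for key, msg in REQUIRED_ITEMS.items() if not acquired.get(key)]
```

For instance, if the user has uploaded an image but given no style or position, `next_guidance({"element_image": "bottle.png"})` returns the style and position messages in list order.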
  • the generation information acquisition unit 111 generates a prompt or prerequisite conditions (collectively referred to as prompt information in this specification) to be input to the image generation model based on the acquired generation information.
  • the prerequisite conditions may include, but are not limited to, information such as image size, frame shape, file size, and resolution.
  • the prompt generated by the generation information acquisition unit 111 includes at least text that represents a style.
  • the generation information acquisition unit 111 generates prompt information, for example, by using a feature extraction module and a language model. Note that the generation information acquisition unit 111 may generate one or more pieces of prompt information, present it to the user, and accept selection or editing of a prompt.
  • the generation information acquisition unit 111 may change the structure of the generated prompt depending on the type of generative model used by the result image generation unit 112 for image generation.
  • the generation information acquisition unit 111 may, for example, generate a sentence-type prompt, or a prompt in the form of a list of words.
  • The prompt may also be generated in a form that lets the generative model recognize which words are important, for example by enclosing important words in parentheses, placing important words at the beginning of the prompt, or including an important word multiple times.
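  • The prompt forms mentioned above (sentence type, word list, and emphasis by parentheses or word order) might be produced along the following lines. This is only a sketch with a hypothetical function; whether a given generative model treats parentheses or word order as emphasis depends on that model.

```python
def build_prompt(words: list, important: set, form: str = "words") -> str:
    """Build a prompt; important words go first and, in list form, in parentheses."""
    first = [w for w in words if w in important]
    rest = [w for w in words if w not in important]
    if form == "sentence":
        # Sentence-type prompt: a single descriptive sentence.
        return "An image of " + ", ".join(first + rest) + "."
    # Word-list prompt: comma-separated, with important words parenthesized.
    return ", ".join([f"({w})" for w in first] + rest)
```

Calling `build_prompt(["white background", "perfume bottle", "soft light"], {"perfume bottle"})` yields `(perfume bottle), white background, soft light`, while `form="sentence"` yields a single sentence with the same ordering.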
  • the prompt generated by the generation information acquisition unit 111 includes at least text representing a style.
  • the generation information acquisition unit 111 may generate multiple prompts, or may impart a certain degree of randomness to the text included in the prompt.
  • The generation information acquisition unit 111 imparts this randomness by adding words chosen according to their semantic distance from, or similarity to, the text representing the style.
  • For example, when the generation information acquired through the user's input or the like includes a large amount of style information related to the "sea", the generation information acquisition unit 111 generates a prompt that includes other words that are close in meaning to "sea" or highly similar to "sea".
  • By using these prompts, the result image generation unit 112, which will be described later, generates a result image that is closer to the image the user wishes to generate. Conversely, when the generation information acquired through the user's input or the like includes little style information related to the "sea", the generation information acquisition unit 111 generates a prompt that includes other words that are distant in meaning from "sea" or have a low similarity to "sea".
  • The result image generation unit 112 uses these prompts to generate a result image, which makes it easier for the user to consider the direction of the style of the result image, for example when the user does not yet have a clear image of the sea in mind. Note that adding a certain degree of randomness to the text included in the prompts may have other effects.
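  • The "sea" example can be sketched with a toy similarity table. Everything here is assumed for illustration: the similarity scores, the 0.5 threshold, and the `style_strength` parameter expressing how much sea-related style information the user supplied. A real system might instead compute similarity from word embeddings.

```python
import random

# Toy similarity scores to "sea" in [0, 1]; purely illustrative values.
SIMILARITY_TO_SEA = {
    "ocean": 0.95, "beach": 0.85, "waves": 0.80,
    "lake": 0.55, "forest": 0.25, "desert": 0.10, "city": 0.05,
}


def extra_words(style_strength: float, k: int = 2, seed: int = 0) -> list:
    """Sample k extra words: near "sea" when strength is high, far when low."""
    rng = random.Random(seed)
    if style_strength >= 0.5:
        pool = [w for w, s in SIMILARITY_TO_SEA.items() if s >= 0.5]
    else:
        pool = [w for w, s in SIMILARITY_TO_SEA.items() if s < 0.5]
    return rng.sample(pool, k)
```

With a high `style_strength` the sampled words stay close to the requested style; with a low value they deliberately drift away from it, which matches the exploratory use described above.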
  • the generation information acquisition unit 111 may acquire additional generation information for the first result image generated by the result image generation unit 112.
  • the additional generation information acquired by the generation information acquisition unit 111 is used to modify or add to the prompt used when generating the first result image, and is used by the result image generation unit 112 when generating the second result image.
  • the result image generation unit 112 generates a result image based on at least one of style information, element image, and position information, as an example.
  • the result image generation unit 112 inputs prompt information generated by the generation information acquisition unit 111 based on at least one of style information, element image, and position information, for example, to a generation model, and acquires an image output by the generation model.
  • the result image generation unit 112 may use the image output by the generation model as a result image, or may generate a result image by performing processing or the like based on the output image.
  • the result image generation unit 112 presents the generated result image to the user. The user can download the presented image.
  • the generative model used by the result image generation unit 112 to generate the result image may be implemented in the server device 1 or in another server accessible via the communication network 2, but is not limited to this. For this reason, when the generative model is implemented in the server device 1, the result image generation unit 112 inputs prompt information to the generative model, and when the generative model is implemented in another server, the result image generation unit 112 transmits the prompt information to the generative model via the communication network 2.
  • the expression "inputting prompt information to the generative model” is used to include the case where the prompt information is transmitted to the generative model.
  • the generative model may be, for example, a model that receives a specific input vector or random noise given as an input and generates an image from that information.
  • the generative model may, for example, include a generator.
  • the generator converts the input information into appropriate features or patterns and converts them into an image.
  • the generator may be constructed using, for example, a Convolutional Neural Network (CNN), a Transformer, or other deep learning architecture, although other architectures may also be used.
  • the generative model may also include, for example, a Discriminator.
  • the Discriminator identifies whether an image is a real image or a fake image generated by the Generator.
  • the Discriminator may, for example, be constructed using a network such as a CNN, but is not limited to these.
  • The generative model may, for example, include a Generative Adversarial Network (GAN).
  • The adversarial network trains the generator to generate more realistic images, while at the same time training the discriminator to improve its ability to tell real images from generated ones.
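  • The source does not state a training objective. For reference, adversarial training of this kind is commonly written as the following minimax objective, where G is the generator, D is the discriminator, p_data is the distribution of real images, and p_z is the distribution of the input noise (these symbols are not taken from the source):

```latex
\min_{G}\,\max_{D}\; V(D, G)
  = \mathbb{E}_{x \sim p_{\mathrm{data}}}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_{z}}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
```

The discriminator maximizes V by assigning high scores to real images and low scores to generated ones, while the generator minimizes V by producing images that the discriminator scores as real.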
  • the result image generating unit 112 may generate two or more result images. In addition, the result image generating unit 112 presents the generated result images to the user.
  • the result image generating unit 112 may accept a selection operation from the user at the user terminal 3 of an image that is close to the result image to be generated from the multiple images, or an image that is different, and may generate a further result image based on the result image selected by the selection operation.
  • the result image generating unit 112 may generate a result image B similar to result image A, for example, based on the characteristics of result image A selected as being close to the result image to be generated from the images.
  • The generation information acquisition unit 111 may modify the prompt that was input to the generation model when generating the selected result image A, or may generate a prompt instructing regeneration of a similar image or a variation of the selected result image A, and may input these prompts into the generation model again to generate the result image, but the method is not limited to this.
  • the result image generating unit 112 may generate a second result image when the generation information acquiring unit 111 acquires additional information from the user for the generated result image (first result image).
  • the result image generating unit 112 inputs to the generation model the prompt information input to the generation model when generating the first result image and the prompt information generated by the generation information acquiring unit 111 based on the additional information, and generates a second result image based on the output information output by the generation model.
  • FIG. 6 is a diagram illustrating an example of the processing performed by the generation support device of this embodiment.
  • the server device 1 acquires generation information from the user (1001).
  • the server device 1 generates a prompt based on the acquired generation information (1002).
  • the server device 1 inputs the prompt into the generation model (1003).
  • the server device 1 acquires output information (result image) of the generation model (1004).
  • the server device 1 presents the output information to the user (1005).
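  • Steps 1001 to 1005 can be sketched as a small pipeline. The prompt builder and the model below are stand-ins so that the example runs without a real generative model; all names are hypothetical.

```python
from typing import Callable


def run_generation(
    generation_info: dict,
    build_prompt: Callable[[dict], str],
    generation_model: Callable[[str], bytes],
) -> dict:
    """Steps 1002 to 1004; the caller handles 1001 (acquire) and 1005 (present)."""
    prompt = build_prompt(generation_info)   # 1002: generate the prompt
    output = generation_model(prompt)        # 1003: input the prompt to the model
    return {"prompt": prompt, "result_image": output}  # 1004: acquire the output


# Stand-ins so the sketch runs without a real model.
def demo_prompt(info: dict) -> str:
    return ", ".join(info["style"] + [info["subject"]])


def demo_model(prompt: str) -> bytes:
    return f"<image for: {prompt}>".encode("utf-8")


result = run_generation(
    {"style": ["natural"], "subject": "soap bar"}, demo_prompt, demo_model
)
```

Passing the prompt builder and the model as callables keeps the flow the same whether the model runs on the server device 1 or on another server reached over the communication network 2.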
  • the server device 1 may, for example, perform pre-processing of the partial image acquired by the generation information acquisition unit 111.
  • the server device 1 may, for example, determine the subject of the partial image and remove the background other than the subject part.
  • the generation information acquisition unit 111 may, for example, emphasize the subject from the partial image.
  • The server device 1 may, for example, determine the relationship between the subject included in the partial image and the camera position, detect the camera angle, and have the generation information acquisition unit 111 generate a prompt accordingly.
  • the prompt may include, for example, a prompt that specifies the angle at which the subject is to be displayed in the resultant image, but is not limited to this example.
  • the generation information acquisition unit 111 may acquire position information in the image to be generated for the partial image after the above-mentioned preprocessing has been performed, or may generate a prompt.
  • the server device 1 may suggest a style to the user based on marketing information.
  • the marketing information may include information acquired in advance, such as information on the product that is the subject of the result image, industry information, the results of marketing research, and information acquired from the user, such as past sales records of the subject product and sales of similar products.
  • the server device 1 suggests to the user a style determined from sales websites and advertising images of similar products with high sales volume for the product that is the subject of the image to be generated, based on information such as sales records of similar products.
  • The server device 1 may present to the user, as a style, the web address of the sales website of a similar product with high sales volume and text information (e.g., "luxurious", "natural", etc.) to be included in the prompt generated by the generation information acquisition unit 111, or may include this information in the prompt used for image generation.
  • the server device 1 may recommend a style to the user based on result images that the user has previously generated using the server device 1, or on information about the prompts used to generate the result images. For example, the server device 1 may analyze result images that the user has previously generated, or determine the style based on text information included in the prompts, present the most frequently detected styles to the user terminal 3, and obtain a selection operation as to whether or not to use the style in generating the result images. Specifically, for example, if the server device 1 determines that the user has only generated realistic images in the past, it may present a question such as "Do you want to generate a realistic image? Yes No" to the user terminal 3 via chat or the like, obtain the user's selection operation, and generate a prompt based on the answer selected by the user.
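  • Recommending the most frequently detected style from past prompts can be sketched as follows. The comma-separated prompt format, the set of known style words, and the question wording are assumptions for illustration.

```python
from collections import Counter
from typing import Optional


def recommend_style(past_prompts: list, known_styles: set) -> Optional[str]:
    """Return the style word that appears most often in past prompts, if any."""
    counts = Counter(
        word
        for prompt in past_prompts
        for word in (w.strip().lower() for w in prompt.split(","))
        if word in known_styles
    )
    return counts.most_common(1)[0][0] if counts else None


def confirmation_question(style: str) -> str:
    """Build the yes/no question shown to the user via chat or the like."""
    return f"Do you want to generate an image in the {style} style? Yes / No"
```

Given the history `["realistic, sea", "realistic, product", "anime, sea"]` and known styles `{"realistic", "anime"}`, the recommended style is `realistic`, and the user's answer then decides whether that word is added to the next prompt.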
  • the server device 1 may generate information related to product sales, not limited to images.
  • Information generated by the server device 1 may include, but is not limited to, banner advertisement images for advertising products and campaigns, effective catchphrases and catchy copy that concisely express the features of products and brands, product descriptions that are text information that describe the detailed explanation and features of products, designs and layouts used on the top pages of EC sites that sell products, category page designs that are designs and display methods of product category pages, landing page designs for highlighting specific campaigns and products, images and catchphrases for social media and advertising platforms, and the like.
  • the generation information acquisition unit 111 When the server device 1 generates the above-mentioned information, the generation information acquisition unit 111 generates a prompt based on the information acquired, and if it is an image or design, the result image generation unit 112 inputs the prompt into an image generation model, and if it is text information, inputs the prompt into a text generation model (for example, a large-scale language model such as ChatGPT) to generate the information.
  • The server device 1 may acquire images included in the website identified by a specified web address as element images.
  • The server device 1 may acquire all images included in the website as element images and store them in the generated information storage unit 121, or may accept a user's selection operation choosing, from among the images included in the website, the images to be acquired as generated information, and store the selected images as element images.
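As a rough sketch of acquiring element images from a website, the following collects the `src` of every `img` tag in an HTML document and optionally keeps only the entries the user selected. It uses only the Python standard library; `ImageCollector`, `collect_element_images`, the sample HTML, and the `example.com` address are hypothetical, and fetching the page and storing the images are left out.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageCollector(HTMLParser):
    """Collect absolute URLs of <img> tags, in document order, without duplicates."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.image_urls: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src")
        if src:
            url = urljoin(self.base_url, src)
            if url not in self.image_urls:
                self.image_urls.append(url)


def collect_element_images(html, base_url, selected=None):
    """Return all image URLs, or only those at the user-selected indexes."""
    collector = ImageCollector(base_url)
    collector.feed(html)
    if selected is None:
        return collector.image_urls
    return [u for i, u in enumerate(collector.image_urls) if i in selected]


sample_html = '<img src="/a.png"><img src="b.jpg"/><img alt="no source">'
base = "https://example.com/shop/"
print(collect_element_images(sample_html, base))
print(collect_element_images(sample_html, base, selected={1}))
```

Passing `selected=None` corresponds to acquiring every image on the page, while a set of indexes corresponds to the user's selection operation.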
  • The device described in this specification may be realized as a single device, or by multiple devices (e.g., cloud servers), some or all of which are connected via a communication network 2.
  • For example, the processor 101 and the storage device 103 of the server device 1 may be realized by different servers connected to each other via the communication network 2.
  • The series of processes performed by the device described in this specification may be realized using software, hardware, or a combination of software and hardware.
  • A computer program for realizing each function of the server device 1 according to this embodiment can be created and installed on a PC or the like.
  • A computer-readable recording medium on which such a computer program is stored can also be provided. Examples of the recording medium include a magnetic disk, an optical disk, a magneto-optical disk, and a flash memory.
  • The above computer program may also be distributed, for example, via the communication network 2 without using a recording medium.
  • Reference Signs List: 1 Server device; 2 Communication network; 3 User terminal; 101 CPU; 102 Memory; 103 Storage device; 104 Communication interface; 105 Input device; 106 Output device; 111 Generated information acquisition unit; 112 Resultant image generating unit; 131 Generated information storage unit; 132 Image information storage unit

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Health & Medical Sciences (AREA)
  • Economics (AREA)
  • Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Processing Or Creating Images (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Transfer Between Computers (AREA)
PCT/JP2024/022627 2023-08-10 2024-06-21 生成支援装置、生成支援プログラム、生成支援方法 Pending WO2025032985A1 (ja)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US19/264,887 US20250336105A1 (en) 2023-08-10 2025-07-10 Generation support device, generation support program, and generation support method

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2023-131665 2023-08-10
JP2023131665A JP7458675B1 (ja) 2023-08-10 2023-08-10 生成支援装置、生成支援プログラム、生成支援方法

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US19/264,887 Continuation US20250336105A1 (en) 2023-08-10 2025-07-10 Generation support device, generation support program, and generation support method

Publications (1)

Publication Number Publication Date
WO2025032985A1 (ja) 2025-02-13

Family

ID=90474199

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2024/022627 Pending WO2025032985A1 (ja) 2023-08-10 2024-06-21 生成支援装置、生成支援プログラム、生成支援方法

Country Status (3)

Country Link
US (1) US20250336105A1
JP (3) JP7458675B1
WO (1) WO2025032985A1

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP7736217B1 (ja) * 2024-05-21 2025-09-09 Toppanホールディングス株式会社 アバター生成システム、アバター生成方法、およびプログラム
WO2025262805A1 (ja) * 2024-06-18 2025-12-26 株式会社Nttドコモ 生成装置及び生成方法
JP7663187B1 (ja) * 2024-12-10 2025-04-16 Clinks株式会社 情報処理システムおよびプログラム

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US12363253B2 (en) * 2021-05-18 2025-07-15 Microsoft Technology Licensing, Llc Realistic personalized style transfer in image processing
CN115836319B (zh) * 2021-07-15 2025-10-17 京东方科技集团股份有限公司 图像处理方法及装置
US12322167B2 (en) * 2021-12-28 2025-06-03 Yahoo Ad Tech Llc Computerized system and method for image creation using generative adversarial networks
US12198224B2 (en) * 2022-02-15 2025-01-14 Adobe Inc. Retrieval-based text-to-image generation with visual-semantic contrastive representation
US20240161258A1 (en) * 2022-11-11 2024-05-16 Shopify Inc. System and methods for tuning ai-generated images
US12462441B2 (en) * 2023-03-20 2025-11-04 Sony Interactive Entertainment Inc. Iterative image generation from text
US20240320873A1 (en) * 2023-03-20 2024-09-26 Adobe Inc. Text-based image generation using an image-trained text
US20240330381A1 (en) * 2023-03-29 2024-10-03 Google Llc User-Specific Content Generation Using Text-To-Image Machine-Learned Models
CN116385584A (zh) 2023-04-03 2023-07-04 平安国际融资租赁有限公司 海报的生成方法、装置、系统及计算机可读存储介质
US20240338859A1 (en) * 2023-04-05 2024-10-10 Adobe Inc. Multilingual text-to-image generation
US12406418B2 (en) * 2023-04-20 2025-09-02 Adobe Inc. Personalized text-to-image generation
CN116433825B (zh) 2023-05-24 2024-03-26 北京百度网讯科技有限公司 图像生成方法、装置、计算机设备及存储介质
WO2025024783A2 (en) * 2023-07-26 2025-01-30 Maplebear Inc. Generating artificial intelligence (ai)-based images using large language machine-learned models

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
ANOBAKA CH: "Behind the scenes of the startup, procurement and development of an image generation AI service that will revolutionize the EC industry [Fotographer AI Suzuki]", XP093288208, Retrieved from the Internet <URL:https://www.youtube.com/watch?v=_WV0lSuN-KA> [retrieved on 20231218] *
ITO, RINTARO: "About trends in artificial intelligence 2023", MEDICAL EYE, vol. 21, no. 4, 31 March 2023 (2023-03-31), pages 78 - 79, XP009560995 *
OTANI, DAI: "The Photorealistic plugin, which outputs a prompt for image generation AI when you input Japanese into ChatGPT, is too convenient", DELAY MANIA, XP009561606, Retrieved from the Internet <URL:https://web.archive.org/web/20230525002720/https://delaymania.com/202305/webservice/chatgpt-plugin-photorealistic/> [retrieved on 20231218] *

Also Published As

Publication number Publication date
JP2025183388A (ja) 2025-12-16
JP7751899B2 (ja) 2025-10-09
US20250336105A1 (en) 2025-10-30
JP2025026277A (ja) 2025-02-21
JP2025026209A (ja) 2025-02-21
JP7458675B1 (ja) 2024-04-01

Similar Documents

Publication Publication Date Title
JP7751899B2 (ja) 生成支援装置、生成支援プログラム、生成支援方法
US11809822B2 (en) Joint visual-semantic embedding and grounding via multi-task training for image searching
Choi et al. Visualizing for the non‐visual: Enabling the visually impaired to use visualization
US8358320B2 (en) Interactive transcription system and method
US7737980B2 (en) Methods and apparatus for supporting and implementing computer based animation
US9754585B2 (en) Crowdsourced, grounded language for intent modeling in conversational interfaces
Lang et al. Attesting similarity: Supporting the organization and study of art image collections with computer vision
EP4336379A1 (en) Tracking concepts within content in content management systems and adaptive learning systems
Choi et al. Assist users' interactions in font search with unexpected but useful concepts generated by multimodal learning
CN114821004A (zh) 虚拟空间构建方法、虚拟空间构建装置、设备及存储介质
CN114693844B (zh) 一种电子绘本生成方法、装置及电子设备
Widiarti et al. Enhancing the Transliteration of Words Written in Javanese Script through Augmented Reality
US20250298838A1 (en) Tracking concepts within content in content management systems and adaptive learning systems
CN100573419C (zh) 将印刷材料与由计算机系统产生的响应关联的方法和系统
Vermeeren Chinese Calligraphy in the digital realm: Aesthetic perfection and remediation of the authentic
Yan From physical to virtual: Enhancing the representation of intangible cultural heritage using mixed reality
Duan et al. Cognitive differences in product shape evaluation between real settings and virtual reality: case study of two-wheel electric vehicles
Cortes-Camarillo et al. Atila: A UIDPs-based educational application generator for mobile devices
Han et al. Hearing with the eyes: modulating lyrics typography for music visualization
Rachabathuni et al. Computer vision and AI TOOLS for enhancing user experience in the cultural heritage domain
CN116912366A (zh) 一种基于ai的平面设计生成方法及系统
Rai et al. MyOcrTool: visualization system for generating associative images of Chinese characters in smart devices
Amr et al. Practical D3. js
Andriushchenko et al. The role of it innovations in shaping changes in the publishing industry of Ukraine
Akbar et al. A Multimodal Analysis of Human-Generated and Machine-Generated Advertisements in Pakistan

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 24851418; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)