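WO2025032985A1 - Generation support device, generation support program, and generation support method - Google Patents
Generation support device, generation support program, and generation support method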
- Publication number
- WO2025032985A1 (PCT/JP2024/022627)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- generation
- information
- image
- style
- acquisition unit
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—Two-dimensional [2D] image generation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—Two-dimensional [2D] image generation
- G06T11/60—Creating or editing images; Combining images with text
Definitions
- the present invention relates to a generation support device, a generation support program, and a generation support method.
- Patent Document 1 proposes a technology that uses machine learning to generate character images.
- Patent Document 1 can only generate images of characters in arbitrary poses, and cannot be applied to generating a variety of images.
- the present invention was made in light of this background, and aims to enable users to easily generate the images they desire.
- the generation assistance device disclosed herein is characterized by comprising: a generation information acquisition unit that acquires generation information from a user, the generation information including at least style information related to the style of a result image and an element image that constitutes part of the result image; and a result image generation unit that inputs text information generated based on the generation information into a generation model and generates the result image based on output information output from the generation model.
- the present invention allows users to easily generate the image they desire.
- FIG. 1 is a diagram showing an example of the overall configuration of a generation support system according to an embodiment of the present invention.
- FIG. 2 is a diagram illustrating an example of a hardware configuration of a server device 1 according to the embodiment.
- FIG. 3 is a diagram illustrating an example of a functional configuration of the server device 1 according to the embodiment.
- FIG. 4 is a diagram showing an example of the generation information stored in a generation information storage unit 131.
- FIG. 5 is an example of a screen on which the generation information acquisition unit 111 acquires position information of a partial image and a material image.
- FIG. 6 is a diagram illustrating an example of a process of the server device 1 according to the embodiment.
- [Item 1] A generation support device that supports generation of a result image, comprising: a generation information acquisition unit that acquires, from a user, generation information including at least style information related to a style of the result image and an element image that constitutes a part of the result image; and a result image generation unit that inputs text information generated based on the generation information into a generation model and generates the result image based on output information output from the generation model.
- [Item 2] The generation support device according to item 1, wherein the generation information acquisition unit further acquires position information of the element image in the result image.
- [Item 3] The generation support device according to item 1 or 2, wherein the style information includes information of a web address, and the generation information acquisition unit determines the style information based on web information included in the website specified by the web address.
- [Item 4] The generation support device, wherein the generation information acquisition unit accepts upload or selection of the element image, and acquires the position information based on an arrangement of the element image in a frame of the result image.
- [Item 5] The generation support device, wherein the generation information acquisition unit acquires the generation information through input by the user in a chat format.
- [Item 6] The generation support device according to item 5, wherein, when acquiring the input in the chat format, the generation information acquisition unit presents to the user a suggestion of information required as the generation information.
- [Item 7] A generation support program for supporting generation of a result image, the program causing a processor to execute: a generation information acquisition step of acquiring, from a user, generation information including at least style information related to a style of the result image and an element image constituting a part of the result image; and an image generation step of inputting text information generated based on the generation information into a generation model, and generating the result image based on output information output from the generation model.
- [Item 8] A generation support method for supporting generation of a result image, in which a processor executes: a generation information acquisition step of acquiring, from a user, generation information including at least style information related to a style of the result image and an element image constituting a part of the result image; and an image generation step of inputting text information generated based on the generation information into a generation model, and generating the result image based on output information output from the generation model.
- FIG. 1 is a diagram showing an example of the overall configuration of a generation support system according to one embodiment of the present invention.
- the generation support system of this embodiment is configured to include a server device 1.
- the server device 1 is communicably connected to a user terminal 3 via a communication network 2.
- the communication network 2 is, for example, the Internet, and is constructed using a public telephone line network, a mobile phone line network, a wireless communication path, Ethernet (registered trademark), or the like.
- the server device 1 may be, for example, a general-purpose computer such as a workstation or a personal computer, or may be logically realized by cloud computing. In this embodiment, one server device is illustrated for convenience of explanation, but the present invention is not limited to this and multiple servers may be used.
- the user terminal 3 is a computer operated by a user who generates an image.
- the user terminal 3 is a smartphone, a tablet computer, a personal computer, etc.
- the user can access the server device 1 by, for example, an application or a web browser executed on the user terminal 3.
- the server device 1 includes a processor 101, a memory 102, a storage device 103, a communication interface 104, an input device 105, and an output device 106.
- the storage device 103 is, for example, a hard disk drive, a solid state drive, or a flash memory that stores various data and programs.
- the communication interface 104 is an interface for connecting to the communication network 2, and is, for example, an adapter for connecting to Ethernet (registered trademark), a modem for connecting to a public telephone line network, a wireless communication device for wireless communication, a USB (Universal Serial Bus) connector or an RS232C connector for serial communication, etc.
- the input device 105 is, for example, a keyboard, a mouse, a touch panel, a button, a microphone, etc. that input data.
- the output device 106 is, for example, a display, a printer, a speaker, etc. that output data.
- Each functional unit of the server device 1, which will be described later, is realized by the processor 101 reading a program stored in the storage device 103 into the memory 102 and executing it, and each storage unit of the server device 1 is realized as part of the storage area provided by the memory 102 and the storage device 103.
- FIG. 3 shows the functional configuration of the server device 1.
- the server device 1 includes a generation information storage unit 131 and a result image information storage unit 132, as well as a generation information acquisition unit 111 and a result image generation unit 112.
- each of the storage units, namely the generation information storage unit 131 and the result image information storage unit 132, is described below.
- the generation information storage unit 131 stores information (hereinafter, referred to as generation information) used to generate a result image (an image generated by the server device 1), as shown in FIG. 4 as an example.
- the generation information may include, for example, style-related information (including text information, web addresses, web information, etc.).
- the generation information may also include element images that are the basis for the configuration of a part of the result image.
- the element images are, for example, partial images that are the subject of the result image (e.g., a person or object, which is described as the main content of the image when the generation information acquisition unit 111 described below generates text to be input to the generation model), and if the subject of the partial images is an image of an object, they may include, for example, an image of a product, an image containing a product, or an image of the appearance of a product container, outer box, etc.
- the element images may also include material images that are not the subject of the result image.
- the generation information may include, for example, information on the position of the partial image and material image in the result image, but is not limited to these.
- the style is the style in the design of the resultant image generated by the generation assistance device, and represents, for example, aesthetic attributes and characteristics such as the requirements of the resultant image (such as objects, people, landscapes, and other elements contained in the image), concept (target, narrative, etc.), color, texture (such as the texture of the image surface that can be felt by the visual senses), layout (such as the arrangement and relative positions of elements), font, and shape (sharp angles, rounded shapes, straight lines, curves, etc.).
- the material images are images that serve as the basis for part of the resulting image, such as images of hands, faces, plants, everyday items, stands, geometric figures such as circles, triangles, and squares, or free-form figures that are not bound by geometric rules.
- the material images may also include images (templates) that serve as the basis for the background of the image to be generated, but are not limited to these.
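- as a rough illustration of the generation information described above, the following Python sketch models style information, element images (partial and material), and optional position information as plain data classes. All class and field names are hypothetical; the specification does not define a concrete schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class ElementImage:
    """One element image; `role` is 'partial' (the subject) or 'material'."""
    path: str
    role: str
    # Optional XY position of the image centre inside the result image.
    position: Optional[Tuple[int, int]] = None


@dataclass
class GenerationInfo:
    """Hypothetical container for the generation information of one request."""
    style_texts: List[str] = field(default_factory=list)
    web_addresses: List[str] = field(default_factory=list)
    element_images: List[ElementImage] = field(default_factory=list)

    def subjects(self) -> List[ElementImage]:
        # Partial images are the subject of the result image.
        return [e for e in self.element_images if e.role == "partial"]


info = GenerationInfo(
    style_texts=["natural", "soft colors"],
    element_images=[
        ElementImage("bottle.png", "partial", (320, 240)),
        ElementImage("leaf.png", "material"),
    ],
)
print([e.path for e in info.subjects()])
```

- keeping position information optional reflects the statement that the generation information may, but need not, include positions.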
- the result image information storage unit 132 stores the result image generated by the result image generation unit 112.
- the generation information acquisition unit 111 acquires generation information from the user terminal 3 via the communication network 2, including style information related to the style of the result image, which is necessary for generating a result image, element images constituting the result image, and position information of the element images in the result image.
- the generation information acquisition unit 111 stores the acquired generation information in the generation information storage unit 131.
- the communication in the transmission and reception may be either wired or wireless, and any communication protocol may be used as long as communication between the two parties is possible.
- the generation information acquisition unit 111 may acquire the generation information as text information.
- the generation information acquisition unit 111 may acquire a sentence indicating the result image to be generated through an input operation by the user, or may acquire one or more words.
- the generation information acquisition unit 111 may also present a sentence or word indicating the style of the result image to be generated to the user, and acquire the sentence or word selected by the user as the generation information.
- the generation information acquisition unit 111 may acquire information on a web address (URL, etc.) as the generation information.
- the generation information acquisition unit 111 may acquire web information contained in a website specified by the web address, determine the style, and set it as the generation information.
- the web information may be text information, image information, video information, or code information contained in the website (the code information being the code constituting the website, which may be in a format such as HTML, CSS, or JavaScript, but is not limited to these).
- the generation information acquisition unit 111 may determine the style of the target, concept, etc., based on the text information, for example.
- the generation information acquisition unit 111 may perform morphological analysis on the text information, and determine the style of the target, concept, etc., based on information on the words contained and the number of words, but is not limited to these methods.
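- the word-based style determination described above can be sketched as follows. This is only an illustration under stated assumptions: a simple regular-expression tokenizer on English text stands in for morphological analysis (which for Japanese would need a dedicated analyzer), and the keyword table `STYLE_KEYWORDS` is invented for the example.

```python
import re
from collections import Counter

# Hypothetical keyword table mapping style labels to indicative words.
STYLE_KEYWORDS = {
    "luxurious": {"premium", "luxury", "exclusive", "elegant"},
    "natural": {"organic", "natural", "botanical", "fresh"},
}


def tokenize(text):
    # Stand-in for morphological analysis: lowercase alphabetic words only.
    return re.findall(r"[a-z]+", text.lower())


def score_styles(text):
    """Count how many tokens of `text` fall under each style label."""
    counts = Counter(tokenize(text))
    return {
        style: sum(counts[w] for w in words)
        for style, words in STYLE_KEYWORDS.items()
    }


def determine_style(text):
    """Return the best-scoring style label, or None if nothing matched."""
    scores = score_styles(text)
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


page_text = "Fresh organic skincare made from natural botanical extracts. Premium quality."
print(determine_style(page_text))
```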
- the generation information acquisition unit 111 may determine the style of the color, texture, font, shape, etc., from the image information, video information, or code information.
- the generation information acquisition unit 111 may analyze the image information or video information and determine the style from the predominant colors, textures, fonts, shapes, etc., or may determine the style based on information contained in the code information, such as the colors, textures, shapes, and fonts used on the website or in its background image, but is not limited to these methods.
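- one possible way to read style hints from code information is sketched below for CSS. It is a minimal example, not the method of the specification: it only recognizes six-digit hex colors and `font-family` declarations, and the function name `style_hints_from_css` is hypothetical.

```python
import re
from collections import Counter

HEX_COLOR = re.compile(r"#[0-9a-fA-F]{6}\b")
FONT_FAMILY = re.compile(r"font-family\s*:\s*([^;}]+)")


def style_hints_from_css(css):
    """Return the most frequent hex color and the declared font families."""
    colors = Counter(c.lower() for c in HEX_COLOR.findall(css))
    fonts = []
    for decl in FONT_FAMILY.findall(css):
        for name in decl.split(","):
            name = name.strip().strip("'\"")
            if name and name not in fonts:
                fonts.append(name)
    dominant = colors.most_common(1)[0][0] if colors else None
    return {"dominant_color": dominant, "fonts": fonts}


css = """
body { background: #F5F0E6; color: #333333; font-family: 'Noto Serif', serif; }
h1   { color: #333333; }
"""
hints = style_hints_from_css(css)
print(hints)
```

- the extracted color and font names could then be turned into style text for the prompt.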
- the generation information acquisition unit 111 acquires element images that are the basis for forming part of the resultant image.
- the generation information acquisition unit 111 may accept uploads of element images.
- the generation information acquisition unit 111 may store a material image (e.g., 201 in FIG. 5) in the server device 1, present the material image on the user terminal 3, accept a selection operation of the material image by the user, and acquire the material image selected by the user as generation information.
- the generation information acquisition unit 111 acquires position information of the element image in the result image.
- the position information indicates the coordinates of the element image in the result image, and may be, for example, XY coordinates with a specific position such as a specific corner of the result image as the origin, and may be the XY coordinates of the center of the element image.
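- the coordinate convention described above can be illustrated with a short sketch. It assumes the origin is the top-left corner of the frame and that the placement of an element image is known as a top-left corner plus a size; the `Placement` class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Placement:
    """Top-left corner and size of an element image inside the frame, in pixels."""
    left: float
    top: float
    width: float
    height: float


def center_position(p):
    """XY coordinates of the element centre, origin at the frame's top-left corner."""
    return (p.left + p.width / 2, p.top + p.height / 2)


def normalized_position(p, frame_width, frame_height):
    """Centre expressed as fractions of the frame size (0.0 to 1.0)."""
    cx, cy = center_position(p)
    return (cx / frame_width, cy / frame_height)


dropped = Placement(left=100, top=50, width=200, height=100)
print(center_position(dropped))
print(normalized_position(dropped, 800, 400))
```

- normalized coordinates are one option for keeping positions independent of the final image size.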
- the generation information acquisition unit 111 presents, on the user terminal 3, a frame corresponding to the shape of the result image (for example, 202 in FIG. 5; the frame may be horizontal, square, vertical, or another shape, but is not limited to these).
- the generation information acquisition unit 111 acquires information on the user's operation on the user terminal 3 in the frame, and acquires position information of the material image in the result image.
- the generation information acquisition unit 111 may, for example, accept a drag-and-drop operation by the user, acquire arrangement information of the partial image (203) or the material image (204), and acquire position information of the element image.
- the generation information acquisition unit 111 may also accept enlargement, reduction, rotation, inversion, deformation, and the like of the element image.
- the generation information acquisition unit 111 may acquire the front-to-back relationship between multiple element images (which may be layer information, for example) as position information.
- the position information may be information about the positional relationship between element images (such as that a material image is located below a partial image).
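- a minimal sketch of layer order and relative positions follows. It assumes centre coordinates with the y axis growing downward from a top-left origin, and integer layer indices where a larger value is closer to the front; both conventions and all names are assumptions made for the example.

```python
def describe_relation(name_a, center_a, name_b, center_b):
    """Describe where A sits relative to B; y grows downward from the top-left origin."""
    dx = center_a[0] - center_b[0]
    dy = center_a[1] - center_b[1]
    if abs(dy) >= abs(dx):
        direction = "below" if dy > 0 else "above"
    else:
        direction = "to the right of" if dx > 0 else "to the left of"
    return f"{name_a} is {direction} {name_b}"


def layer_order(layers):
    """Sort element names from back to front by their layer index."""
    return [name for name, z in sorted(layers.items(), key=lambda kv: kv[1])]


print(describe_relation("stand", (200, 300), "bottle", (200, 150)))
print(layer_order({"bottle": 2, "stand": 1, "background": 0}))
```

- a textual relation of this kind could be placed directly into a prompt, while the layer order could guide compositing.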
- the generation information acquisition unit 111 may acquire the generation information in a chat format.
- the generation information acquisition unit 111 may divide the text information acquired in the chat format into words by morphological analysis or the like, and acquire the word information as the generation information.
- the generation information acquisition unit 111 may help the user easily recognize what generation information is needed by presenting guidance such as "Please upload a subject image" or "Please tell us a reference website".
- the generation information acquisition unit 111 may, by referring to a list prepared in advance of the generation information required for image generation, provide guidance to the user regarding information that has not been acquired from the user or that could not be determined from the acquired generation information, but is not limited to this method.
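- the checklist-based guidance can be sketched as below. The required fields and the guidance messages in `REQUIRED_FIELDS` are hypothetical examples, not a list taken from the specification.

```python
# Hypothetical checklist: required field -> guidance shown when it is missing.
REQUIRED_FIELDS = {
    "subject_image": "Please upload a subject image.",
    "style": "Please tell us a reference website or describe the style.",
    "frame_shape": "Please choose a frame shape (horizontal, square, or vertical).",
}


def next_guidance(acquired):
    """Return guidance messages for every required field not yet acquired."""
    return [
        message
        for field_name, message in REQUIRED_FIELDS.items()
        if not acquired.get(field_name)
    ]


acquired = {"subject_image": "bottle.png", "style": ""}
for line in next_guidance(acquired):
    print(line)
```

- an empty value is treated the same as a missing one, so a field the user left blank in the chat still triggers guidance.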
- the generation information acquisition unit 111 generates a prompt or prerequisite conditions (collectively referred to as prompt information in this specification) to be input to the image generation model based on the acquired generation information.
- the prerequisite conditions may include, but are not limited to, information such as image size, frame shape, file size, and resolution.
- the prompt generated by the generation information acquisition unit 111 includes at least text that represents a style.
- the generation information acquisition unit 111 generates prompt information, for example, by using a feature extraction module and a language model. Note that the generation information acquisition unit 111 may generate one or more pieces of prompt information, present it to the user, and accept selection or editing of a prompt.
- the generation information acquisition unit 111 may change the structure of the generated prompt depending on the type of generative model used by the result image generation unit 112 for image generation.
- the generation information acquisition unit 111 may, for example, generate a sentence-type prompt, or a prompt in the form of a list of words.
- the prompt may be generated in a form that makes the generative model recognize important words, using a method of indicating word importance such as enclosing important words in parentheses, placing important words at the beginning of the prompt, or including an important word multiple times.
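- the two prompt structures and the emphasis methods mentioned above might look like the following sketch. The `model_type` labels and function names are hypothetical, and whether a given generative model actually honors parentheses or word order is model-specific.

```python
def sentence_prompt(subject, styles):
    """Sentence-type prompt."""
    return f"A {', '.join(styles)} image of {subject}."


def keyword_prompt(subject, styles, important=()):
    """Word-list prompt: important words come first and are wrapped in parentheses."""
    words = [subject, *styles]
    first = [f"({w})" for w in words if w in important]
    rest = [w for w in words if w not in important]
    return ", ".join(first + rest)


def build_prompt(model_type, subject, styles, important=()):
    # `model_type` is a hypothetical label for the kind of generative model in use.
    if model_type == "sentence":
        return sentence_prompt(subject, styles)
    return keyword_prompt(subject, styles, important)


print(build_prompt("sentence", "a perfume bottle", ["luxurious", "minimal"]))
print(build_prompt("keywords", "perfume bottle", ["luxurious", "minimal"], important=("luxurious",)))
```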
- the generation information acquisition unit 111 may generate multiple prompts, or may impart a certain degree of randomness to the text included in the prompt.
- the generation information acquisition unit 111 imparts a certain degree of randomness to the text representing the style, depending on the semantic distance or similarity to that text.
- for example, when the generation information acquired through the user's input or the like includes a large amount of style information related to the "sea", the generation information acquisition unit 111 generates a prompt that includes other words that are close in meaning to "sea" or highly similar to it.
- the result image generation unit 112, which will be described later, uses these prompts to generate a result image that is closer to the image the user wishes to generate.
- conversely, when the generation information acquired through the user's input or the like includes little style information related to the "sea", the generation information acquisition unit 111 generates a prompt that includes other words that are distant in meaning from "sea" or have a low similarity to it.
- the result image generation unit 112 uses these prompts to generate a result image, which makes it easier for the user to consider the direction of the style of the result image, for example when the user does not yet have a clear image of the sea in mind. Note that adding a certain degree of randomness to the text included in the prompts may have other effects as well.
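- the similarity-dependent randomness can be illustrated with a toy example. The two-dimensional word vectors in `VECTORS`, the `threshold` on the amount of style information, and the pool size `k` are all invented for this sketch; a real system would presumably use learned embeddings.

```python
import math
import random

# Toy 2-D word vectors, invented for illustration only.
VECTORS = {
    "sea": (1.0, 0.0),
    "ocean": (0.98, 0.2),
    "beach": (0.9, 0.45),
    "forest": (0.1, 1.0),
    "city": (-0.5, 0.85),
}


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def pick_variation(style_word, style_info_count, threshold=3, k=2, rng=random):
    """Pick a word near `style_word` when style info is plentiful, far otherwise."""
    others = [w for w in VECTORS if w != style_word]
    ranked = sorted(
        others,
        key=lambda w: cosine(VECTORS[style_word], VECTORS[w]),
        reverse=True,
    )
    pool = ranked[:k] if style_info_count >= threshold else ranked[-k:]
    return rng.choice(pool)


print(pick_variation("sea", style_info_count=5))
print(pick_variation("sea", style_info_count=1))
```

- with plenty of sea-related input the added word stays close to "sea"; with little input it is drawn from distant words, which widens the range of styles shown to the user.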
- the generation information acquisition unit 111 may acquire additional generation information for the first result image generated by the result image generation unit 112.
- the additional generation information acquired by the generation information acquisition unit 111 is used to modify or add to the prompt used when generating the first result image, and is used by the result image generation unit 112 when generating the second result image.
- the result image generation unit 112 generates a result image based on at least one of style information, element image, and position information, as an example.
- the result image generation unit 112 inputs prompt information generated by the generation information acquisition unit 111 based on at least one of style information, element image, and position information, for example, to a generation model, and acquires an image output by the generation model.
- the result image generation unit 112 may use the image output by the generation model as a result image, or may generate a result image by performing processing or the like based on the output image.
- the result image generation unit 112 presents the generated result image to the user. The user can download the presented image.
- the generative model used by the result image generation unit 112 to generate the result image may be implemented in the server device 1 or in another server accessible via the communication network 2, but is not limited to this. For this reason, when the generative model is implemented in the server device 1, the result image generation unit 112 inputs prompt information to the generative model, and when the generative model is implemented in another server, the result image generation unit 112 transmits the prompt information to the generative model via the communication network 2.
- the expression "inputting prompt information to the generative model” is used to include the case where the prompt information is transmitted to the generative model.
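- the two deployment options can be hidden behind a single "input" operation, as in the sketch below. The class `ResultImageGenerator` and the stand-in functions are hypothetical; the remote case is represented by an arbitrary callable rather than a real network client.

```python
from typing import Callable, Optional


class ResultImageGenerator:
    """Sends prompt information to a local model or to a remote one via a transport."""

    def __init__(
        self,
        local_model: Optional[Callable[[str], bytes]] = None,
        remote_send: Optional[Callable[[str], bytes]] = None,
    ):
        if local_model is None and remote_send is None:
            raise ValueError("either a local model or a remote transport is required")
        self._local_model = local_model
        self._remote_send = remote_send

    def input_prompt(self, prompt: str) -> bytes:
        # "Inputting" covers both a direct call and transmission over the network.
        if self._local_model is not None:
            return self._local_model(prompt)
        return self._remote_send(prompt)


# Stand-ins that return fake image bytes; a real deployment would call a model or an API.
def fake_local(prompt):
    return b"local:" + prompt.encode()


def fake_remote(prompt):
    return b"remote:" + prompt.encode()


print(ResultImageGenerator(local_model=fake_local).input_prompt("sea"))
print(ResultImageGenerator(remote_send=fake_remote).input_prompt("sea"))
```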
- the generative model may be, for example, a model that receives a specific input vector or random noise given as an input and generates an image from that information.
- the generative model may, for example, include a generator.
- the generator converts the input information into appropriate features or patterns and converts them into an image.
- the generator may be constructed using, for example, a Convolutional Neural Network (CNN), a Transformer, or other deep learning architecture, although other architectures may also be used.
- the generative model may also include, for example, a Discriminator.
- the Discriminator identifies whether an image is a real image or a fake image generated by the Generator.
- the Discriminator may, for example, be constructed using a network such as a CNN, but is not limited to these.
- the generative model may, for example, include a Generative Adversarial Network (GAN).
- the adversarial network trains the generator to generate more realistic images, while at the same time training the discriminator to improve its ability to distinguish real images from generated ones.
- the result image generating unit 112 may generate two or more result images. In addition, the result image generating unit 112 presents the generated result images to the user.
- the result image generating unit 112 may accept a selection operation from the user at the user terminal 3 of an image that is close to the result image to be generated from the multiple images, or an image that is different, and may generate a further result image based on the result image selected by the selection operation.
- the result image generating unit 112 may generate a result image B similar to result image A, for example, based on the characteristics of result image A selected as being close to the result image to be generated from the images.
- the generation information acquiring unit 111 may modify the prompt information that was input to the generation model when generating result image A selected by the selection operation, or may generate a prompt that requests regeneration of an image or a variation similar to the selected result image A, and input these prompts again into the generation model to generate the result image, but is not limited to this method.
- the result image generating unit 112 may generate a second result image when the generation information acquiring unit 111 acquires additional information from the user for the generated result image (first result image).
- the result image generating unit 112 inputs to the generation model the prompt information input to the generation model when generating the first result image and the prompt information generated by the generation information acquiring unit 111 based on the additional information, and generates a second result image based on the output information output by the generation model.
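- generating a second result image from the first prompt plus additional information might be sketched as follows. The way the two pieces of prompt text are joined is an assumption made for the example, and `model` is any callable standing in for the generation model.

```python
def refine_prompt(first_prompt, additional_info):
    """Combine the first prompt with prompt text derived from additional information."""
    additions = [a.strip() for a in additional_info if a.strip()]
    if not additions:
        return first_prompt
    return first_prompt.rstrip(". ") + ", " + ", ".join(additions)


def generate_second_image(model, first_prompt, additional_info):
    # `model` is any callable that maps a prompt string to an output image.
    return model(refine_prompt(first_prompt, additional_info))


print(refine_prompt("A natural image of a soap bar.", ["warmer lighting", " "]))
```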
- FIG. 6 is a diagram illustrating an example of the processing performed by the generation support device of this embodiment.
- the server device 1 acquires generation information from the user (1001).
- the server device 1 generates a prompt based on the acquired generation information (1002).
- the server device 1 inputs the prompt into the generation model (1003).
- the server device 1 acquires output information (result image) of the generation model (1004).
- the server device 1 presents the output information to the user (1005).
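- steps 1001 to 1005 above can be sketched end to end as follows. The request format, the prompt template, and `stub_generation_model` are placeholders invented for the example; no real image generation model is called.

```python
def acquire_generation_info(request):
    # Step 1001: take style text and subject from the user's request.
    return {"style": request["style"], "subject": request["subject"]}


def generate_prompt(info):
    # Step 1002: build text to input into the generation model.
    return f"{info['style']} image of {info['subject']}"


def stub_generation_model(prompt):
    # Steps 1003-1004: a placeholder model returning fake output information.
    return {"prompt": prompt, "image": b"\x89PNG-placeholder"}


def run_generation(request, model=stub_generation_model):
    info = acquire_generation_info(request)   # 1001
    prompt = generate_prompt(info)            # 1002
    output = model(prompt)                    # 1003, 1004
    return {"presented_image": output["image"], "prompt": prompt}  # 1005


result = run_generation({"style": "natural", "subject": "a soap bar"})
print(result["prompt"])
```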
- the server device 1 may, for example, perform pre-processing of the partial image acquired by the generation information acquisition unit 111.
- the server device 1 may, for example, determine the subject of the partial image and remove the background other than the subject part.
- the generation information acquisition unit 111 may, for example, emphasize the subject from the partial image.
- the server device 1 may, for example, determine the relationship between the subject included in the partial image and the camera position, detect the camera angle, and have the generation information acquisition unit 111 generate a prompt accordingly.
- the prompt may include, for example, a prompt that specifies the angle at which the subject is to be displayed in the resultant image, but is not limited to this example.
- the generation information acquisition unit 111 may acquire position information in the image to be generated for the partial image after the above-mentioned preprocessing has been performed, or may generate a prompt.
- the server device 1 may suggest a style to the user based on marketing information.
- the marketing information may include information acquired in advance, such as information on the product that is the subject of the result image, industry information, the results of marketing research, and information acquired from the user, such as past sales records of the subject product and sales of similar products.
- the server device 1 suggests to the user a style determined from sales websites and advertising images of similar products with high sales volume for the product that is the subject of the image to be generated, based on information such as sales records of similar products.
- the server device 1 may present to the user information on the web address of the sales website of the similar product with high sales volume and text information (e.g., "luxurious", "natural", etc.) to be included as a style in the prompt generated by the generation information acquisition unit 111, or may include it in the prompt used for image generation.
- the server device 1 may recommend a style to the user based on result images that the user has previously generated using the server device 1, or on information about the prompts used to generate the result images. For example, the server device 1 may analyze result images that the user has previously generated, or determine the style based on text information included in the prompts, present the most frequently detected styles to the user terminal 3, and obtain a selection operation as to whether or not to use the style in generating the result images. Specifically, for example, if the server device 1 determines that the user has only generated realistic images in the past, it may present a question such as "Do you want to generate a realistic image? Yes No" to the user terminal 3 via chat or the like, obtain the user's selection operation, and generate a prompt based on the answer selected by the user.
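- the recommendation based on past prompts can be sketched with a simple frequency count. The style labels in `KNOWN_STYLES` and the substring matching are simplifying assumptions; analysis of past result images is not shown.

```python
from collections import Counter

KNOWN_STYLES = ("realistic", "illustration", "watercolor")  # hypothetical labels


def most_frequent_style(past_prompts):
    """Return the style label that appears in the most past prompts, or None."""
    counts = Counter()
    for prompt in past_prompts:
        lowered = prompt.lower()
        counts.update(s for s in KNOWN_STYLES if s in lowered)
    return counts.most_common(1)[0][0] if counts else None


def style_question(style):
    return f"Do you want to generate a {style} image? Yes / No"


def apply_answer(prompt, style, answer):
    """Add the suggested style to the prompt only when the user answered yes."""
    return f"{style}, {prompt}" if answer.strip().lower() == "yes" else prompt


history = ["Realistic photo of a mug", "realistic bottle on a table", "watercolor flowers"]
style = most_frequent_style(history)
print(style_question(style))
print(apply_answer("a soap bar", style, "Yes"))
```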
- the server device 1 may generate information related to product sales, not limited to images.
- Information generated by the server device 1 may include, but is not limited to, banner advertisement images for advertising products and campaigns, effective catchphrases and catchy copy that concisely express the features of products and brands, product descriptions that are text information that describe the detailed explanation and features of products, designs and layouts used on the top pages of EC sites that sell products, category page designs that are designs and display methods of product category pages, landing page designs for highlighting specific campaigns and products, images and catchphrases for social media and advertising platforms, and the like.
- when the server device 1 generates the above-mentioned information, the generation information acquisition unit 111 generates a prompt based on the acquired information; if the information to be generated is an image or design, the result image generation unit 112 inputs the prompt into an image generation model, and if it is text information, inputs the prompt into a text generation model (for example, a large-scale language model such as ChatGPT) to generate the information.
- the server device 1 may acquire images included in the website specified by the specified web address as element images.
- the server device 1 may acquire all images included in the website as element images and store them in the generation information storage unit 131, or may accept a user's selection operation for an image to be acquired as generation information from among the images included in the website, and store the selected image as the element image.
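- collecting candidate element images from a website can be sketched with the Python standard library. The example parses an HTML string that is assumed to have been fetched already; the class and function names are hypothetical, and images referenced only from CSS or scripts are not covered.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageCollector(HTMLParser):
    """Collects absolute URLs of <img> tags found in an HTML document."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src")
        if src:
            # Resolve relative paths against the page address.
            self.image_urls.append(urljoin(self.base_url, src))


def candidate_element_images(html, base_url):
    collector = ImageCollector(base_url)
    collector.feed(html)
    return collector.image_urls


html = '<html><body><img src="/img/bottle.png"><img alt="no source"><img src="leaf.jpg"></body></html>'
urls = candidate_element_images(html, "https://example.com/shop/")
print(urls)
```

- the resulting list could either be stored in full or shown to the user for selection, matching the two options described above.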
- The device described in this specification may be realized as a single device, or may be realized by multiple devices (e.g., cloud servers), some or all of which are connected via a communication network 2.
- The processor 101 and the storage device 103 of the server device 1 may be realized by different servers connected to each other via the communication network 2.
- The series of processes performed by the device described in this specification may be realized using software, hardware, or a combination of software and hardware.
- A computer program for realizing each function of the server device 1 according to this embodiment can be created and installed on a PC or the like.
- A computer-readable recording medium on which such a computer program is stored can also be provided. Examples of the recording medium include a magnetic disk, optical disk, magneto-optical disk, and flash memory.
- The above computer program may also be distributed, for example, via the communication network 2 without using a recording medium.
Reference Signs List
- 1 Server device
- 2 Communication network
- 3 User terminal
- 101 CPU
- 102 Memory
- 103 Storage device
- 104 Communication interface
- 105 Input device
- 106 Output device
- 111 Generated information acquisition unit
- 112 Resultant image generating unit
- 131 Generated information storage unit
- 132 Image information storage unit
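The description states that a prompt is built from the acquired information and then sent to an image generation model for images and designs, or to a text generation model for text information. The routing step might be sketched as follows. This is a minimal illustration only: the output kind names, the `GenerationRequest` fields, the prompt wording, and the stub models are all assumptions and do not come from the specification.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical output kinds; the description only distinguishes
# "image or design" from "text information".
IMAGE_KINDS = {"banner_image", "top_page_design",
               "category_page_design", "landing_page_design"}
TEXT_KINDS = {"catchphrase", "product_description"}


@dataclass
class GenerationRequest:
    kind: str     # what to generate, e.g. "banner_image" (assumed naming)
    product: str  # illustrative piece of acquired generation information
    style: str    # illustrative piece of acquired generation information


def build_prompt(req: GenerationRequest) -> str:
    """Stand-in for the prompt generation attributed to unit 111."""
    label = req.kind.replace("_", " ")
    return f"Create a {label} for {req.product} in a {req.style} style."


def route(req: GenerationRequest,
          image_model: Callable[[str], str],
          text_model: Callable[[str], str]) -> str:
    """Send the prompt to the image model or the text model by output kind."""
    prompt = build_prompt(req)
    if req.kind in IMAGE_KINDS:
        return image_model(prompt)
    if req.kind in TEXT_KINDS:
        return text_model(prompt)
    raise ValueError(f"unsupported output kind: {req.kind}")


# Stub models so the sketch runs without any external service.
def fake_image_model(prompt: str) -> str:
    return f"<image generated from: {prompt}>"


def fake_text_model(prompt: str) -> str:
    return f"<text generated from: {prompt}>"
```

Passing the models in as callables keeps the routing logic independent of any particular image or text generation service, which would be substituted for the stubs in practice.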
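The description also mentions acquiring the images included in a website as element images, either all of them or only those the user selects. One way this could look is sketched below, assuming the page HTML has already been fetched. The function and class names and the index-based selection are hypothetical; the specification does not say how images are extracted or how the selection is represented.

```python
from html.parser import HTMLParser
from typing import List, Optional, Sequence
from urllib.parse import urljoin


class ImageCollector(HTMLParser):
    """Collects the src of every <img> tag, resolved against base_url."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.urls: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:  # skip <img> tags that have no usable src
                self.urls.append(urljoin(self.base_url, src))


def collect_element_images(html: str, base_url: str,
                           selected: Optional[Sequence[int]] = None) -> List[str]:
    """Return all image URLs on the page, or only the user-selected ones.

    `selected` is an assumed representation of the user's selection
    operation: indices into the list of images found on the page.
    """
    collector = ImageCollector(base_url)
    collector.feed(html)
    if selected is None:
        return collector.urls                      # keep every image
    return [collector.urls[i] for i in selected]   # keep only the picks


page = ('<html><body><img src="/a.png">'
        '<img src="https://cdn.example.com/b.jpg"/>'
        '<img alt="no source"></body></html>')
all_images = collect_element_images(page, "https://shop.example.com/top")
picked = collect_element_images(page, "https://shop.example.com/top", selected=[1])
```

Here `all_images` corresponds to storing every image on the page, and `picked` to storing only the image the user chose.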
Landscapes
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Business, Economics & Management (AREA)
- Tourism & Hospitality (AREA)
- General Health & Medical Sciences (AREA)
- Economics (AREA)
- Health & Medical Sciences (AREA)
- Human Resources & Organizations (AREA)
- Marketing (AREA)
- Primary Health Care (AREA)
- Strategic Management (AREA)
- General Business, Economics & Management (AREA)
- Processing Or Creating Images (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Information Transfer Between Computers (AREA)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US19/264,887 US20250336105A1 (en) | 2023-08-10 | 2025-07-10 | Generation support device, generation support program, and generation support method |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2023-131665 | 2023-08-10 | ||
| JP2023131665A JP7458675B1 (ja) | 2023-08-10 | 2023-08-10 | Generation support device, generation support program, and generation support method |
Related Child Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US19/264,887 Continuation US20250336105A1 (en) | 2023-08-10 | 2025-07-10 | Generation support device, generation support program, and generation support method |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2025032985A1 (ja) | 2025-02-13 |
Family
ID=90474199
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/JP2024/022627 Pending WO2025032985A1 (ja) | 2023-08-10 | 2024-06-21 | Generation support device, generation support program, and generation support method |
Country Status (3)
| Country | Link |
|---|---|
| US (1) | US20250336105A1 |
| JP (3) | JP7458675B1 |
| WO (1) | WO2025032985A1 |
Families Citing this family (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP7736217B1 (ja) * | 2024-05-21 | 2025-09-09 | Toppanホールディングス株式会社 | Avatar generation system, avatar generation method, and program |
| WO2025262805A1 (ja) * | 2024-06-18 | 2025-12-26 | 株式会社Nttドコモ | Generation device and generation method |
| JP7663187B1 (ja) * | 2024-12-10 | 2025-04-16 | Clinks株式会社 | Information processing system and program |
Family Cites Families (13)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US12363253B2 (en) * | 2021-05-18 | 2025-07-15 | Microsoft Technology Licensing, Llc | Realistic personalized style transfer in image processing |
| CN115836319B (zh) * | 2021-07-15 | 2025-10-17 | 京东方科技集团股份有限公司 | Image processing method and apparatus |
| US12322167B2 (en) * | 2021-12-28 | 2025-06-03 | Yahoo Ad Tech Llc | Computerized system and method for image creation using generative adversarial networks |
| US12198224B2 (en) * | 2022-02-15 | 2025-01-14 | Adobe Inc. | Retrieval-based text-to-image generation with visual-semantic contrastive representation |
| US20240161258A1 (en) * | 2022-11-11 | 2024-05-16 | Shopify Inc. | System and methods for tuning ai-generated images |
| US12462441B2 (en) * | 2023-03-20 | 2025-11-04 | Sony Interactive Entertainment Inc. | Iterative image generation from text |
| US20240320873A1 (en) * | 2023-03-20 | 2024-09-26 | Adobe Inc. | Text-based image generation using an image-trained text |
| US20240330381A1 (en) * | 2023-03-29 | 2024-10-03 | Google Llc | User-Specific Content Generation Using Text-To-Image Machine-Learned Models |
| CN116385584A (zh) | 2023-04-03 | 2023-07-04 | 平安国际融资租赁有限公司 | Poster generation method, apparatus, system, and computer-readable storage medium |
| US20240338859A1 (en) * | 2023-04-05 | 2024-10-10 | Adobe Inc. | Multilingual text-to-image generation |
| US12406418B2 (en) * | 2023-04-20 | 2025-09-02 | Adobe Inc. | Personalized text-to-image generation |
| CN116433825B (zh) | 2023-05-24 | 2024-03-26 | 北京百度网讯科技有限公司 | Image generation method, apparatus, computer device, and storage medium |
| WO2025024783A2 (en) * | 2023-07-26 | 2025-01-30 | Maplebear Inc. | Generating artificial intelligence (ai)-based images using large language machine-learned models |
2023
- 2023-08-10: JP application JP2023131665A, published as JP7458675B1 (ja), Active
2024
- 2024-03-12: JP application JP2024038021A, published as JP7751899B2 (ja), Active
- 2024-06-21: WO application PCT/JP2024/022627, published as WO2025032985A1 (ja), Pending
2025
- 2025-07-10: US application US19/264,887, published as US20250336105A1 (en), Pending
- 2025-09-19: JP application JP2025155495A, published as JP2025183388A (ja), Pending
Non-Patent Citations (3)
| Title |
|---|
| ANOBAKA CH: "Behind the scenes of the startup, procurement and development of an image generation AI service that will revolutionize the EC industry [Fotographer AI Suzuki]", XP093288208, Retrieved from the Internet <URL:https://www.youtube.com/watch?v=_WV0lSuN-KA> [retrieved on 20231218] * |
| ITO, RINTARO: "About trends in artificial intelligence 2023", MEDICAL EYE, vol. 21, no. 4, 31 March 2023 (2023-03-31), pages 78 - 79, XP009560995 * |
| OTANI, DAI: "The Photorealistic plugin, which outputs a prompt for image generation AI when you input Japanese into ChatGPT, is too convenient", DELAY MANIA, XP009561606, Retrieved from the Internet <URL:https://web.archive.org/web/20230525002720/https://delaymania.com/202305/webservice/chatgpt-plugin-photorealistic/> [retrieved on 20231218] * |
Also Published As
| Publication number | Publication date |
|---|---|
| JP2025183388A (ja) | 2025-12-16 |
| JP7751899B2 (ja) | 2025-10-09 |
| US20250336105A1 (en) | 2025-10-30 |
| JP2025026277A (ja) | 2025-02-21 |
| JP2025026209A (ja) | 2025-02-21 |
| JP7458675B1 (ja) | 2024-04-01 |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| JP7751899B2 (ja) | Generation support device, generation support program, and generation support method | |
| US11809822B2 (en) | Joint visual-semantic embedding and grounding via multi-task training for image searching | |
| Choi et al. | Visualizing for the non‐visual: Enabling the visually impaired to use visualization | |
| US8358320B2 (en) | Interactive transcription system and method | |
| US7737980B2 (en) | Methods and apparatus for supporting and implementing computer based animation | |
| US9754585B2 (en) | Crowdsourced, grounded language for intent modeling in conversational interfaces | |
| Lang et al. | Attesting similarity: Supporting the organization and study of art image collections with computer vision | |
| EP4336379A1 (en) | Tracking concepts within content in content management systems and adaptive learning systems | |
| Choi et al. | Assist users' interactions in font search with unexpected but useful concepts generated by multimodal learning | |
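| CN114821004A (zh) | Virtual space construction method, virtual space construction apparatus, device, and storage medium | |
| CN114693844B (zh) | Electronic picture book generation method, apparatus, and electronic device | |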
| Widiarti et al. | Enhancing the Transliteration of Words Written in Javanese Script through Augmented Reality | |
| US20250298838A1 (en) | Tracking concepts within content in content management systems and adaptive learning systems | |
| CN100573419C (zh) | Method and system for associating printed material with a response generated by a computer system | |
| Vermeeren | Chinese Calligraphy in the digital realm: Aesthetic perfection and remediation of the authentic | |
| Yan | From physical to virtual: Enhancing the representation of intangible cultural heritage using mixed reality | |
| Duan et al. | Cognitive differences in product shape evaluation between real settings and virtual reality: case study of two-wheel electric vehicles | |
| Cortes-Camarillo et al. | Atila: A UIDPs-based educational application generator for mobile devices | |
| Han et al. | Hearing with the eyes: modulating lyrics typography for music visualization | |
| Rachabathuni et al. | Computer vision and AI TOOLS for enhancing user experience in the cultural heritage domain | |
| CN116912366A (zh) | AI-based graphic design generation method and system | |
| Rai et al. | MyOcrTool: visualization system for generating associative images of Chinese characters in smart devices | |
| Amr et al. | Practical D3. js | |
| Andriushchenko et al. | The role of it innovations in shaping changes in the publishing industry of Ukraine | |
| Akbar et al. | A Multimodal Analysis of Human-Generated and Machine-Generated Advertisements in Pakistan |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 24851418; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |