CN117649461B - Interactive image generation method and system based on space layout and use method thereof - Google Patents
- Publication number
- CN117649461B (application CN202410115444.3A)
- Authority
- CN
- China
- Prior art keywords
- image
- target
- reference image
- attribute
- code
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G06T 11/00—2D [Two Dimensional] image generation
- G06N 3/0475—Generative networks (under G06N 3/00 Computing arrangements based on biological models; G06N 3/02 Neural networks; G06N 3/04 Architecture)
- G06N 3/094—Adversarial learning (under G06N 3/08 Learning methods)
- G06T 11/60—Editing figures and text; Combining figures or text
Abstract
The invention discloses an interactive image generation method and system based on spatial layout, and a method of using the same. The method comprises the following steps: importing a target image and a reference image, wherein the target image is the original image of the task the user wants to generate, and the reference image is an image that provides feature attributes for the target image; transferring the feature attributes of the reference image to the target image, and adjusting the influence weight of those attributes on the target image by controlling the distance between the target image and the reference image; and generating a new target image based on the adjusted target image and reference image. The invention lets the user easily adjust image positions to control the generated result, enhancing visual interaction and efficiency. This approach gives users better control over the results and increases their creative freedom.
Description
Technical Field
The invention belongs to the technical field of picture generation, and particularly relates to an interactive image generation method and system based on spatial layout and a use method thereof.
Background
Although the generative capability of generative adversarial networks (GANs) is impressive, controlling the style and content of the images they generate remains a challenge. Active control over the style and content of GAN-generated images is critical for meeting specific requirements in practical applications. This has driven the emergence of GAN image generation tools that enable users to express their creativity and ideas. In addition, personalized generation has gained traction, producing various tools aimed at meeting users' wishes.
However, existing tools still lack control flexibility and user friendliness. For example, sketch-based tools typically require the user to have specific drawing skills or image-editing experience, and the resulting images often fail to reach the desired level of realism. Slider-based tools offer limited options, ignoring diverse user requirements and creative expression. Text-based tools rely on abstract input, leaving a gap between the user's expectations and the obtained results. In addition, some tools suffer from usability problems such as cluttered interfaces and complex operation. As users increasingly seek freedom and personalization in generation tools, traditional tool modes are no longer adequate for specific scenarios.
Disclosure of Invention
The invention aims to provide a novel tool specially designed for flexible image generation. Its main goal is a multifunctional image generation platform that goes beyond traditional control functions. Through an innovative 2D layout design, the user can easily adjust image positions to control the generated result, enhancing intuitive interaction and efficiency. The method gives the user better control over the result and improves creative freedom. In addition, the tool integrates real-world images as references, so the user can use the attributes of existing images to guide the generation of the target image. This makes results predictable, stimulates creative exploration and experimentation, and promotes a firm connection between user and content.
To achieve the above object, the present invention provides a spatial layout-based interactive image generation method, comprising:
importing a target image and a reference image; wherein the target image is the original image of the task the user wants to generate, and the reference image is an image for providing feature attributes for the target image;
transferring the feature attributes of the reference image to the target image, and adjusting the influence weight of the feature attributes of the reference image on the target image by controlling the distance between the target image and the reference image;
and generating a new target image according to the final weights.
Optionally, the feature attributes of the reference image include local attributes and global attributes;
the local attributes include: eyes, nose, mouth and hair;
the global attributes include: makeup, age, face shape and head orientation.
Optionally, transferring a local attribute of the reference image to the target image comprises:
providing a corresponding mask for each reference image by a mask preprocessing method; when a local attribute is selected from the reference image, identifying the mask corresponding to the selected local attribute; combining the identified mask with the reference image to extract the region of the local attribute;
adding the extracted region to the target image to create a new input image; processing the input image and the target image using a pre-trained encoder to generate two corresponding latent vectors Code_t and Code_i;
inputting the latent vectors Code_t and Code_i into a pre-trained image generator using weighted addition to produce the transfer result of the local attribute.
Optionally, transferring a global attribute of the reference image to the target image comprises:
selecting, from a pre-collected image data set, a base image aligned with the reference image; wherein the aligned base image differs from the reference image in only one attribute, and the base image is used to extract the global feature of the reference image;
inputting the base image, the target image and the reference image into a pre-trained encoder to obtain latent vectors Code_t, Code_r and Code_b;
extracting a representation of the global attribute by subtracting Code_b from Code_r, and adding Code_t to the difference to generate a new latent code;
inputting the new latent code into a pre-trained image generator to generate a single global attribute transfer image.
Optionally, the influence weight of the feature attributes of the reference image on the target image is adjusted by controlling the distance between the target image and the reference image according to:
$w_i = \frac{k}{\lVert tar - ref_i \rVert}$, $\quad result = G\left(Code_t + \sum_{i=1}^{n} w_i \cdot Code_i\right)$

wherein $w_i$ is the distance weight, $k$ is a constant, $tar$ is the (x, y) coordinate of the target image, $ref_i$ is the (x, y) coordinate of the i-th reference image, $result$ is the final generated picture, $G$ is the generator, $n$ is the number of reference images, and $Code_i$ is the latent vector of the i-th reference image.
To achieve the above object, the present invention also provides an interactive image generation system based on spatial layout, including: an importing module, an adjusting module and a generating module;
The importing module is used for importing a target image and a reference image; wherein the target image is the original image of the task the user wants to generate, and the reference image is an image for providing feature attributes for the target image;
The adjusting module is used for transferring the feature attributes of the reference image to the target image and adjusting the influence weight of the feature attributes of the reference image on the target image by controlling the distance between the target image and the reference image;
The generation module is used for generating a new target image based on the adjusted target image and the reference image.
Optionally, the feature attributes of the reference image include local attributes and global attributes;
the local attributes include: eyes, nose, mouth and hair;
the global attributes include: makeup, age, face shape and head orientation.
To achieve the above object, the present invention further provides a method for using an interactive image generation system based on spatial layout, including:
Importing a target picture and a reference picture;
And selecting the feature attributes of the reference picture, and controlling the generation effect by adjusting the distance between the target picture and the reference picture.
The invention has the following beneficial effects:
The invention imports a target image and a reference image, transfers the feature attributes of the reference image to the target image, and adjusts the influence weight of the feature attributes of the reference image on the target image by controlling the distance between the target image and the reference image. The user can easily adjust image positions to control the generated result, enhancing intuitive interaction and efficiency. This approach gives users better control over the results and increases their creative freedom. Furthermore, the tool integrates real-world images as references, enabling the user to use the attributes of existing images to guide the generation of the target image. This not only makes results predictable, but also stimulates creative exploration and experimentation, thereby promoting a firmer connection between user and content.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application. In the drawings:
FIG. 1 is a schematic flow chart of an interactive image generation method based on spatial layout according to an embodiment of the invention;
FIG. 2 is a schematic diagram of an interface design for generating pictures according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a local attribute transfer according to an embodiment of the present invention;
FIG. 4 is a global attribute transfer diagram of an embodiment of the present invention;
FIG. 5 is a diagram of an imported image according to an embodiment of the present invention;
FIG. 6 is a diagram illustrating a picture import to a user workspace according to an embodiment of the present invention;
FIG. 7 is a diagram illustrating an attribute selection box according to an embodiment of the present invention;
FIG. 8 is a schematic diagram of controlling the intensity of a generated effect by dragging a reference picture according to an embodiment of the present invention;
FIG. 9 is a schematic diagram of a generated picture effect according to an embodiment of the present invention;
FIG. 10 is a mask exemplary diagram of an embodiment of the present invention;
fig. 11 is a schematic diagram showing the difference between a base image and a reference image according to an embodiment of the present invention.
Detailed Description
It should be noted that, without conflict, the embodiments of the present application and features of the embodiments may be combined with each other. The application will be described in detail below with reference to the drawings in connection with embodiments.
It should be noted that the steps illustrated in the flowcharts of the figures may be performed in a computer system such as a set of computer executable instructions, and that although a logical order is illustrated in the flowcharts, in some cases the steps illustrated or described may be performed in an order other than that illustrated herein.
Example 1
As shown in fig. 1, the present embodiment provides a spatial layout-based interactive image generation method, including:
Importing a target image and a reference image; wherein the target image is the original image of the task the user wants to generate, and the reference image is an image for providing feature attributes for the target image;
Transferring the feature attributes of the reference image to the target image, and adjusting the influence weight of the feature attributes of the reference image on the target image by controlling the distance between the target image and the reference image;
Based on the adjusted target image and the reference image, a new target image is generated.
Further, the feature attributes of the reference image include local attributes and global attributes;
the local attributes include: eyes, nose, mouth and hair;
the global attributes include: makeup, age, face shape and head orientation.
Further, transferring a local attribute of the reference image to the target image includes:
providing a corresponding mask for each reference image by a mask preprocessing method; when a local attribute is selected from the reference image, the tool automatically identifies the related mask; the mask is combined with the reference image to extract the region of the local attribute;
adding the extracted region to the target image to create a new input image; processing the input image and the target image using a pre-trained encoder to generate two corresponding latent vectors Code_t and Code_i;
inputting the latent vectors Code_t and Code_i into the pre-trained image generator using weighted addition to produce the transfer result of the local attribute.
Further, transferring a global attribute of the reference image to the target image includes:
selecting, from a pre-collected dataset, a base image aligned with the reference image;
inputting the base image, the target image and the reference image into a pre-trained encoder to obtain latent vectors Code_t, Code_r and Code_b;
extracting a representation of the global attribute by subtracting Code_b from Code_r, and adding Code_t to the difference to generate a new latent code;
inputting the new latent code into a pre-trained image generator to generate a single global attribute transfer image.
The interactive image generation method based on spatial layout of this embodiment is implemented as follows.
Data preparation: images are collected from the FFHQ dataset and a head pose dataset, and cropped from the result plots of Image2StyleGAN and BeautyGAN. All of these data are portraits.
Model preparation: using a GAN for generation tasks on real-world pictures requires two components: a generator (typically StyleGAN) to produce the results, and an encoder (a GAN inversion technique) to map the features of real-world pictures into the latent space of StyleGAN. In this embodiment, StyleGAN is used as the picture generator and e4e (encoder4editing) is used as the encoder. Pre-trained models are used for both, so the tool in this embodiment does not involve model training.
Interface design: as shown in fig. 2, this embodiment proposes a mode of picture generation in which a target picture is given attributes by importing reference pictures from the real world. For this purpose, a completely new two-dimensional user interface was designed; the interface of the system was written using PyQt, which is a novelty of this embodiment. In this two-dimensional interface the target picture sits in the middle of the work area, and reference pictures can be placed anywhere in the work area by the user. A reference picture provides attributes for the target picture, for example adding the glasses in the reference picture to the target picture. This embodiment also designs an attribute selection box through which the user can autonomously select the attributes of a reference picture. In addition, the user can control the strength of a picture's attribute in the generated result by adjusting the position of the reference picture: the closer the reference picture is to the target picture, the stronger the attribute, and vice versa.
User workflow: the user may import an image from the real world or from a provided dataset through the reference picture column as a reference image. The reference image is used to provide attributes for the generated image. Meanwhile, this embodiment introduces a novel 2D spatial layout. The user imports the image to be generated (the target image) by clicking the center area of the spatial layout workspace. The user can then add reference images by clicking or dragging them from the reference picture column into the workspace; connecting lines are also added to display their relationship. Right-clicking a reference image triggers the attribute selection box, allowing the user to select the desired attributes from that image.
Thereafter, the user may move the reference image to control the effect of the selected attribute on the generated result: the closer the reference image is to the target image, the more similar they become, and vice versa. This embodiment allows the user to redo, undo, or reset adjustments during generation by clicking the corresponding button. Furthermore, this embodiment changes the color and thickness of the connecting line accordingly to further visualize the effect. Integrating these interactive functions into the 2D workspace greatly enhances the user friendliness and intuitiveness of the tool of this embodiment.
The following describes the principles of the algorithms behind the interface: attribute transfer describes how the features of a reference picture are added to the target picture, and distance weight calculation describes how the user controls the generation effect by adjusting the distance between pictures.
Attribute transfer.
Through continued study of existing work, this embodiment finds that picture attributes generally fall into two categories: local attributes (e.g. glasses, nose, mouth) and global attributes (e.g. makeup, age, face shape). Global and local attributes call for different methods.
Local attributes. This embodiment employs a mask preprocessing method to provide a corresponding mask for each reference image. Owing to the standardization of GAN-based image generation, masks can be shared among most images. When the user selects an attribute (e.g., mouth) from reference image i, the tool automatically identifies the associated mask. The mask is combined with the reference image to extract the mouth region. The extracted region is then added to the target image to create a new input image. The input and target images are processed with the pre-trained e4e encoder, producing two corresponding latent vectors Code_t and Code_i. Finally, their weighted addition is input to the pre-trained StyleGAN generator to produce the mouth attribute transfer result. Here Code_t is the latent vector of the target image and Code_i is the latent vector derived from the i-th reference image.
The mouth mask in fig. 3 is such a mask. The specific procedure (as shown in fig. 3) is: the user imports the target picture and the reference picture; the user selects an attribute of the reference picture (for example, the mouth); the system automatically selects the mask representing the mouth; the reference picture is combined with the mask to extract the mouth region; the extracted mouth region is combined with the target picture to obtain a fused picture; the fused picture and the target picture are then passed through the encoder, and their latent vectors are combined and decoded by the generator. An example of the mask is shown in fig. 10.
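The following Python sketch illustrates this local-attribute pipeline. The encoder and generator are stand-ins for the pre-trained e4e and StyleGAN models, the mask is assumed to be a 0/1 array, and the convex-blend form of the weighted addition is an illustrative assumption, since the text does not spell out the exact formula.

```python
import numpy as np

def transfer_local_attribute(target_img, reference_img, mask,
                             encoder, generator, weight=1.0):
    """Sketch of the local-attribute transfer described above.

    target_img / reference_img: float arrays of shape (H, W, 3).
    mask: 0/1 array of shape (H, W, 1) marking the selected region
    (e.g. the mouth). encoder and generator stand in for the
    pre-trained e4e and StyleGAN models.
    """
    # 1. Combine the mask with the reference picture to extract the region.
    region = reference_img * mask

    # 2. Paste the extracted region onto the target picture to create
    #    the new input image.
    input_img = target_img * (1.0 - mask) + region

    # 3. Encode both images into latent vectors Code_i and Code_t.
    code_i = encoder(input_img)
    code_t = encoder(target_img)

    # 4. Weighted addition of the two codes, then decode with the
    #    generator. This convex blend is one plausible reading of
    #    "weighted addition"; the patent does not give the formula.
    blended = code_t + weight * (code_i - code_t)
    return generator(blended)
```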
Global attributes. For global attributes the above algorithm is not applicable, so this embodiment adopts an image-code subtraction method. When the user selects a global attribute (e.g., makeup) in the attribute selection box, the system selects from the pre-collected portrait dataset a corresponding base image aligned with the reference image; fig. 11 shows the difference between base images and reference images, which differ in only one attribute. The target image, reference image and base image are then input into the e4e encoder to obtain the corresponding latent vectors Code_t, Code_r and Code_b. By subtracting Code_b from Code_r, a representation of the makeup-related attribute is extracted. Code_t is then added to it to generate a new latent code. Finally, a single global attribute transfer image is generated by inputting the new code into the StyleGAN generator, as shown in fig. 4. Here Code_t corresponds to the target image, Code_r to the reference image, and Code_b to the base image.
The role of the base image is to extract the global feature of the reference image, since a global feature cannot simply be obtained through a mask the way a local feature can. Therefore the base image differs from its corresponding reference picture in only one global attribute. For example, if the reference image is a woman wearing makeup, the corresponding base image is the same woman without makeup: the person is identical in both images, and the expression, pose, etc. are approximately the same (the GAN generator allows slight differences). The calculation of this embodiment can be viewed simply as (code of the reference image) minus (code of the base image); the base image is called the base because it is the subtrahend used to extract the feature of the reference image.
The base image is itself a normal image; it differs from its corresponding reference picture in only one global attribute.
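A minimal sketch of the code-subtraction method, under the same assumptions (encoder and generator stand in for the pre-trained e4e and StyleGAN; the scaling factor weight is an illustrative addition):

```python
def transfer_global_attribute(target_img, reference_img, base_img,
                              encoder, generator, weight=1.0):
    """Sketch of single global attribute transfer by code subtraction."""
    code_t = encoder(target_img)     # latent vector of the target image
    code_r = encoder(reference_img)  # latent vector of the reference image
    code_b = encoder(base_img)       # latent vector of the aligned base image

    # The base image differs from the reference image in exactly one
    # global attribute, so the subtraction isolates that attribute.
    attribute_code = code_r - code_b

    # Add the (optionally scaled) attribute representation to the target
    # code and decode it to obtain the transfer result.
    return generator(code_t + weight * attribute_code)
```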
Distance weight calculation.
To let the distance between the reference image and the target image affect the generated result, the tool of this embodiment automatically calculates this distance and assigns the reference image an attribute weight inversely related to it. Specifically, in the two-dimensional work area, the farther a reference picture is from the target picture, the fewer of its features appear in the generated result; the closer it is, the more of its features appear.
Finally, for multiple reference pictures, this embodiment takes a weighted sum of the Code_i generated for each reference picture and adds it to Code_t:
$w_i = \frac{k}{\lVert tar - ref_i \rVert}$, $\quad result = G\left(Code_t + \sum_{i=1}^{n} w_i \cdot Code_i\right)$

wherein $w_i$ is the distance weight, $k$ is a constant (chosen through a few simple tests), $tar$ is the (x, y) coordinate of the target image, $ref_i$ is the (x, y) coordinate of the i-th reference image, $result$ is the final generated picture, $G$ is the generator, $n$ is the number of reference images, and $Code_i$ is the latent vector of the i-th reference image.
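A sketch of this rule in Python. The inverse-distance form w_i = k / ||tar - ref_i|| follows the formula above; the value of k and the clamp that avoids division by zero are assumptions added for illustration.

```python
import numpy as np

def generate_with_distance_weights(tar_xy, ref_xys, code_t, ref_codes,
                                   generator, k=100.0):
    """Weighted sum of the reference codes, added to Code_t.

    tar_xy: (x, y) of the target picture; ref_xys: list of (x, y) of the
    reference pictures; ref_codes: their latent vectors Code_i. k is the
    constant from the formula (its value here is an assumption).
    """
    tar = np.asarray(tar_xy, dtype=float)
    total = np.zeros_like(code_t)
    for ref_xy, code_i in zip(ref_xys, ref_codes):
        dist = np.linalg.norm(tar - np.asarray(ref_xy, dtype=float))
        w_i = k / max(dist, 1e-6)  # closer reference -> larger weight
        total += w_i * code_i
    # Decode the combined code into the final picture.
    return generator(code_t + total)
```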
Example 2
This embodiment also provides an interactive image generation system based on spatial layout, comprising: an importing module, an adjusting module and a generating module;
The importing module is used for importing a target image and a reference image; wherein the target image is the original image of the task the user wants to generate, and the reference image is an image for providing feature attributes for the target image;
The adjusting module is used for transferring the feature attributes of the reference image to the target image and adjusting the influence weight of the feature attributes of the reference image on the target image by controlling the distance between the target image and the reference image;
and the generation module is used for generating a new target image based on the adjusted target image and the reference image.
Further, the feature attributes of the reference image include local attributes and global attributes;
the local attributes include: eyes, nose, mouth and hair;
the global attributes include: makeup, age, face shape and head orientation.
The workflow of the interactive image generation system based on the spatial layout proposed in the embodiment is as follows:
Two terms are explained first. 1. Target picture: the original picture of the task the user wants to generate; the user imports this picture into the system and obtains a new picture on its basis. 2. Reference picture: a picture used to provide features for the target picture. For example, if the user wants to give the target picture another picture's smile, the attribute of that smile is selected through feature selection. Reference pictures are typically selected from the real world by the user.
The user first selects a target picture and a reference picture (the reference picture is imported into the reference picture column) (as shown in fig. 5).
Next, the user may click a picture in the reference picture column to import it into the user workspace. Fig. 6 shows four imported reference pictures; note that they are draggable. In addition, when a reference picture is imported into the user workspace, a weight line is automatically generated between it and the target picture. The line changes color and thickness according to the distance between the two pictures, and this color and thickness show the weight of the reference picture in subsequent calculations.
At the same time, the user may click the button on the right side of the workspace, called the generate button. Clicking it produces the corresponding result.
Further, this embodiment provides the user with an attribute selection box (as shown in fig. 7), opened by right-clicking a reference picture. By selecting which attributes of the reference picture appear, the user controls the generation; for example, if only the age attribute of the picture is selected, the generated picture shows the person older than in the original. The attribute selection box currently offers 8 attributes: eyes, nose, mouth, hair, age, face shape, head orientation and makeup.
In addition, the user can drag a reference picture with the left mouse button. By dragging the reference picture, the user controls the intensity of the generated effect, as shown in fig. 8.
Specific implementation of the functions:
The tool is written mainly with PyQt5 and matplotlib. Qt Designer in PyQt5 is used for the tool's user interface. The generation network used by this tool is StyleGAN, and in order to use real-world pictures for generation tasks a GAN inversion technique is required; this embodiment uses e4e (an encoder).
Definition of reference pictures: QGraphicsPixmapItem is used for the definition; the specific class is class GraphicItem(QGraphicsPixmapItem). This class has two key methods. One is def mouseMoveEvent(self, event), through which dragging of the reference picture is implemented. The other is def mousePressEvent(self, event), which defines two user operations, middle click and right click, corresponding respectively to deleting the picture and opening the attribute selection box.
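A hedged sketch of such a class is shown below; only the class name, the base class and the two event handlers come from the description, while the callback wiring and the method bodies are illustrative assumptions.

```python
from PyQt5.QtCore import Qt
from PyQt5.QtWidgets import QGraphicsPixmapItem

class GraphicItem(QGraphicsPixmapItem):
    """Sketch of the draggable reference-picture item."""

    def __init__(self, pixmap, on_delete, on_attribute_box):
        super().__init__(pixmap)
        # Let Qt handle left-button dragging of the item.
        self.setFlag(QGraphicsPixmapItem.ItemIsMovable, True)
        self._on_delete = on_delete                # called on middle click
        self._on_attribute_box = on_attribute_box  # called on right click

    def mousePressEvent(self, event):
        if event.button() == Qt.MiddleButton:
            self._on_delete(self)           # delete this reference picture
        elif event.button() == Qt.RightButton:
            self._on_attribute_box(self)    # open the attribute selection box
        else:
            super().mousePressEvent(event)  # left click begins a drag

    def mouseMoveEvent(self, event):
        super().mouseMoveEvent(event)  # default handling moves the item
        # The weight line to the target picture would be refreshed here.
```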
For the picture-reading function (importing the target picture and reference pictures): the pictures and their paths are read using the QFileDialog dialog.
For the move function: because def mouseMoveEvent(self, event) is included in the definition of the reference picture, the user can drag the reference picture with the left mouse button.
For the feature selection box: the feature selection box is built with PyQt5's QDialog. Each reference picture has a list storing 8 variables corresponding to the 8 attributes; initially all of them are true. When the user clicks (deselects) an attribute, the corresponding variable changes from true to false, meaning that attribute of the reference picture no longer participates in the final generation effect, and vice versa.
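As a sketch, the per-picture attribute switches could be kept in a simple structure like the following (the attribute names come from the list above; the class and method names are hypothetical, and the original stores the switches as a plain list of 8 booleans):

```python
ATTRIBUTES = ["eyes", "nose", "mouth", "hair",
              "age", "face shape", "head orientation", "makeup"]

class AttributeFlags:
    """All 8 attributes start as True for every reference picture."""

    def __init__(self):
        self.flags = {name: True for name in ATTRIBUTES}

    def toggle(self, name):
        # Clicking an attribute in the selection box flips its switch;
        # a False attribute no longer participates in generation.
        self.flags[name] = not self.flags[name]

    def selected(self):
        return [n for n, on in self.flags.items() if on]
```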
For the weight line: QGraphicsLineItem is used to implement this function. First the distance between the reference picture and the target picture is calculated, then the thickness and color of the weight line are adjusted; both vary inversely with the distance. The closer the distance, the thicker and darker the line; the farther the distance, the thinner and lighter the line.
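A sketch of such a weight-line update is shown below; the mapping constants (max_dist, the width and alpha ranges) are assumptions, since the text only fixes the direction of the relationship: closer means thicker and darker.

```python
import math
from PyQt5.QtCore import QLineF
from PyQt5.QtGui import QColor, QPen
from PyQt5.QtWidgets import QGraphicsLineItem

def update_weight_line(line: QGraphicsLineItem, ref_pos, tar_pos,
                       max_dist=600.0):
    """Restyle the line between a reference picture and the target."""
    dist = math.hypot(ref_pos[0] - tar_pos[0], ref_pos[1] - tar_pos[1])
    closeness = max(0.0, 1.0 - dist / max_dist)  # 1 = touching, 0 = far

    pen = QPen(QColor(0, 0, 0, int(55 + 200 * closeness)))  # darker when close
    pen.setWidthF(1.0 + 5.0 * closeness)                     # thicker when close
    line.setPen(pen)
    line.setLine(QLineF(ref_pos[0], ref_pos[1], tar_pos[0], tar_pos[1]))
```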
The picture generation process: all reference pictures and the target picture are passed through the encoder (e4e) to generate corresponding codes. According to the selections in the feature selection box, the codes are added together with certain weights; the addition method is given under attribute transfer and distance weight calculation. A final code is thereby obtained, and passing this code through the generator (StyleGAN) produces a picture with the corresponding effect.
Example 3
This embodiment provides a method of using an interactive image generation system based on spatial layout, comprising the following steps:
Importing a target picture and a reference picture;
And selecting the feature attributes of the reference picture, and controlling the generation effect by adjusting the distance between the target picture and the reference picture.
Outline of the steps the user follows:
1. The user selects a target picture (the original picture intended for the generation task) and reference pictures (pictures providing features).
2. The user places the reference pictures in the central user workspace.
3. The user adjusts the distance (see distance weight calculation) and the attributes (see attribute transfer) of each reference picture to control the generated effect.
4. Iteration continues until the desired result is obtained.
Experiments and user studies show that the image generation tool of this embodiment outperforms existing tools in degrees of freedom, ease of reaching the target, satisfaction with results, and users' willingness to use it. Fig. 9 shows an example of the generated effect.
The present application is not limited to the above-mentioned embodiments, and any changes or substitutions that can be easily understood by those skilled in the art within the technical scope of the present application are intended to be included in the scope of the present application. Therefore, the protection scope of the present application should be subject to the protection scope of the claims.
Claims (5)
1. An interactive image generation method based on spatial layout, comprising:
importing a target image and a reference image; wherein the target image is the original image of the task the user wants to generate, and the reference image is an image for providing feature attributes for the target image;
transferring the feature attributes of the reference image to the target image, and adjusting the influence weight of the feature attributes of the reference image on the target image by controlling the distance between the target image and the reference image;
The feature attributes of the reference image include local attributes and global attributes;
Transferring a local attribute of the reference image to the target image comprises:
providing a corresponding mask for each reference image by a mask preprocessing method; when a local attribute is selected from the reference image, identifying the mask corresponding to the selected local attribute; combining the identified mask with the reference image to extract the region of the local attribute;
adding the extracted region to the target image to create a new input image; processing the input image and the target image using a pre-trained encoder to generate two corresponding latent vectors Code_t and Code_i;
inputting the latent vectors Code_t and Code_i into a pre-trained image generator using weighted addition to produce the transfer result of the local attribute;
transferring a global attribute of the reference image to the target image comprises:
selecting, from a pre-collected image data set, a base image aligned with the reference image; wherein the aligned base image differs from the reference image in only one global attribute, and the base image is used to extract the global feature of the reference image;
inputting the base image, the target image and the reference image into a pre-trained encoder to obtain latent vectors Code_t, Code_r and Code_b;
extracting a representation of the global attribute by subtracting Code_b from Code_r, and adding Code_t to the difference to generate a new latent code;
inputting the new latent code into a pre-trained image generator to generate a single global attribute transfer image;
the influence weight of the feature attributes of the reference image on the target image is adjusted by controlling the distance between the target image and the reference image according to:
$w_i = \frac{k}{\lVert tar - ref_i \rVert}$, $\quad result = G\left(Code_t + \sum_{i=1}^{n} w_i \cdot Code_i\right)$

wherein $w_i$ is the distance weight, $k$ is a constant, $tar$ is the (x, y) coordinate of the target image, $ref_i$ is the (x, y) coordinate of the i-th reference image, $result$ is the final generated picture, $G$ is the generator, $n$ is the number of reference images, and $Code_i$ is the latent vector of the i-th reference image;
And generating a new target image according to the final weights.
2. The interactive image generation method based on spatial layout according to claim 1, wherein
the local attributes include: eyes, nose, mouth and hair;
the global attributes include: makeup, age, face shape and head orientation.
3. A spatial layout based interactive image generation system for implementing the spatial layout based interactive image generation method of any of claims 1-2, the system comprising: an importing module, an adjusting module and a generating module;
The importing module is used for importing a target image and a reference image; wherein the target image is the original image of the task the user wants to generate, and the reference image is an image for providing feature attributes for the target image;
The adjusting module is used for transferring the feature attributes of the reference image to the target image and adjusting the influence weight of the feature attributes of the reference image on the target image by controlling the distance between the target image and the reference image;
The generation module is used for generating a new target image based on the adjusted target image and the reference image.
4. A spatial layout based interactive image generation system according to claim 3, wherein the feature attributes of the reference image include local attributes and global attributes;
the local attributes include: eyes, nose, mouth and hair;
the global attributes include: makeup, age, face shape and head orientation.
5. A method of using a spatial layout based interactive image generation system, wherein the spatial layout based interactive image generation system of any of claims 3-4 is applied, the method of using comprising:
Importing a target picture and a reference picture;
And selecting the feature attributes of the reference picture, and controlling the generation effect by adjusting the distance between the target picture and the reference picture.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410115444.3A CN117649461B (en) | 2024-01-29 | 2024-01-29 | Interactive image generation method and system based on space layout and use method thereof |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117649461A (en) | 2024-03-05
CN117649461B (en) | 2024-05-07
Family
ID=90048042
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410115444.3A Active CN117649461B (en) | 2024-01-29 | 2024-01-29 | Interactive image generation method and system based on space layout and use method thereof |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117649461B (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111814566A (en) * | 2020-06-11 | 2020-10-23 | 北京三快在线科技有限公司 | Image editing method, image editing device, electronic equipment and storage medium |
CN113426129A (en) * | 2021-06-24 | 2021-09-24 | 网易(杭州)网络有限公司 | User-defined role appearance adjusting method, device, terminal and storage medium |
CN113744257A (en) * | 2021-09-09 | 2021-12-03 | 展讯通信(上海)有限公司 | Image fusion method and device, terminal equipment and storage medium |
CN116824625A (en) * | 2023-05-29 | 2023-09-29 | 北京交通大学 | Target re-identification method based on generation type multi-mode image fusion |
CN116862759A (en) * | 2023-06-19 | 2023-10-10 | 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) | Personalized portrait generation system and method based on generation countermeasure network |
CN117235114A (en) * | 2023-09-20 | 2023-12-15 | 江苏华真信息技术有限公司 | Retrieval method based on cross-modal semantic and mixed inverse fact training |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||