CN117726720A - Image editing method, device, electronic equipment and readable storage medium - Google Patents



Publication number
CN117726720A
Authority
CN
China
Prior art keywords: image, edited, editing, hidden space, feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311787479.3A
Other languages
Chinese (zh)
Inventor
张健 (Zhang Jian)
牟冲 (Mou Chong)
Current Assignee
Peking University Shenzhen Graduate School
Original Assignee
Peking University Shenzhen Graduate School
Priority date
Filing date
Publication date
Application filed by Peking University Shenzhen Graduate School
Priority to CN202311787479.3A
Publication of CN117726720A


Landscapes

  • Processing Or Creating Images (AREA)

Abstract

The application discloses an image editing method, an image editing device, an electronic device, and a readable storage medium. The image editing method comprises the following steps: acquiring an image to be edited and a corresponding image diffusion model; extracting features of the image to be edited through the image diffusion model to obtain hidden space image features of the image to be edited; iteratively diffusing the image to be edited according to the hidden space image features to obtain a hidden space feature image; and performing image editing on the hidden space feature image through the image diffusion model to obtain a target image. The method and the device address the technical problem of poor editing generalization in image editing.

Description

Image editing method, device, electronic equipment and readable storage medium
Technical Field
The present disclosure relates to the field of image editing technologies, and in particular, to an image editing method, an image editing device, an electronic device, and a readable storage medium.
Background
With the continuous development of technology, demands on image editing capability keep increasing. Unlike the traditional image editing mode of modifying image content such as color and contrast or adding filters, current image editing combines deep learning techniques to meet users' demands for diversified and intelligent editing.
At present, image editing is usually performed based on a generative adversarial network (GAN), that is, by editing the representation of the image in the GAN model's hidden space. However, because a GAN model can only edit images of a specific type, when a user wants to edit a natural image, an editing effect meeting the user's requirement cannot be achieved, which easily leads to a poor user experience when performing image editing. The editing generalization of current image editing is therefore poor.
Disclosure of Invention
The main purpose of the present application is to provide an image editing method, an image editing device, an electronic device, and a readable storage medium, which aim to solve the technical problem of poor editing generalization of image editing in the prior art.
To achieve the above object, the present application provides an image editing method including:
acquiring an image to be edited and a corresponding image diffusion model;
extracting features of the image to be edited through the image diffusion model to obtain hidden space image features of the image to be edited;
according to the hidden space image characteristics, carrying out iterative diffusion on the image to be edited to obtain a hidden space characteristic image;
And carrying out image editing on the hidden space feature image through the image diffusion model to obtain a target image.
Optionally, the image to be edited comprises a natural image to be edited and a reference image to be edited of the natural image to be edited,
the step of extracting features of the image to be edited through the image diffusion model to obtain hidden space image features of the image to be edited comprises the following steps:
performing iterative inverse sampling on the natural image to be edited through the image diffusion model to generate natural image characteristics of the natural image to be edited in a hidden space;
performing iterative inverse sampling on the reference image to be edited through the image diffusion model to generate reference image characteristics of the reference image to be edited in the hidden space;
and taking the natural image characteristic and the reference image characteristic as the hidden space image characteristic.
Optionally, before the step of iteratively diffusing the image to be edited according to the hidden space image features to obtain the hidden space feature image, the image editing method further includes:
acquiring first intermediate sampling information of the natural image to be edited in an iterative inverse sampling process and second intermediate sampling information of the reference image to be edited in the iterative inverse sampling process;
Converting the first intermediate sampled information to a first intermediate sampled feature and converting the second intermediate sampled information to a second intermediate sampled feature;
calculating a target guiding gradient of the hidden space image feature according to the first intermediate sampling feature and the second intermediate sampling feature;
and carrying out iterative diffusion on the image to be edited according to the target guide gradient and the hidden space image characteristics to obtain a hidden space characteristic image.
Optionally, the step of calculating the target guidance gradient of the hidden space image feature from the first intermediate sampling feature and the second intermediate sampling feature comprises:
according to the first intermediate sampling feature and the second intermediate sampling feature, calculating editing similarity of a target editing area of the image to be edited before and after editing;
and calculating the target guiding gradient of the hidden space image characteristic according to the energy loss function corresponding to the editing similarity.
Optionally, the energy loss function is as follows:

E = 1 - S(F_t^gen[m_gen], F_t^gud[m_gud])

wherein S is the editing similarity, F_t^gen is the first intermediate sampling feature, F_t^gud is the second intermediate sampling feature, m_gen is the target editing region before editing, and m_gud is the target editing region after editing.
Optionally, the calculation formula of the target guidance gradient is as follows:

g_t = ∇_{Z_t} E

that is, the gradient of the energy loss E with respect to the hidden space image feature Z_t.
optionally, the target image includes a first target image and a second target image, where the first target image carries image features of the natural image to be edited, and the second target image carries image features of the natural image to be edited and the reference image to be edited.
To achieve the above object, the present application also provides an image editing apparatus comprising:
the acquisition module is used for acquiring the image to be edited and the corresponding image diffusion model;
the extraction module is used for extracting the characteristics of the image to be edited through the image diffusion model to obtain the hidden space image characteristics of the image to be edited;
the diffusion module is used for carrying out iterative diffusion on the image to be edited according to the hidden space image characteristics to obtain a hidden space characteristic image;
and the editing module is used for carrying out image editing on the hidden space feature image through the image diffusion model to obtain a target image.
Optionally, the image to be edited includes a natural image to be edited and a reference image to be edited of the natural image to be edited, and the extracting module is further configured to:
Performing iterative inverse sampling on the natural image to be edited through the image diffusion model to generate natural image characteristics of the natural image to be edited in a hidden space;
performing iterative inverse sampling on the reference image to be edited through the image diffusion model to generate reference image characteristics of the reference image to be edited in the hidden space;
and taking the natural image characteristic and the reference image characteristic as the hidden space image characteristic.
Optionally, the image editing apparatus is further configured to:
acquiring first intermediate sampling information of the natural image to be edited in an iterative inverse sampling process and second intermediate sampling information of the reference image to be edited in the iterative inverse sampling process;
converting the first intermediate sampled information to a first intermediate sampled feature and converting the second intermediate sampled information to a second intermediate sampled feature;
calculating a target guiding gradient of the hidden space image feature according to the first intermediate sampling feature and the second intermediate sampling feature;
and carrying out iterative diffusion on the image to be edited according to the target guide gradient and the hidden space image characteristics to obtain a hidden space characteristic image.
Optionally, the image editing apparatus is further configured to:
according to the first intermediate sampling feature and the second intermediate sampling feature, calculating editing similarity of a target editing area of the image to be edited before and after editing;
and calculating the target guiding gradient of the hidden space image characteristic according to the energy loss function corresponding to the editing similarity.
Optionally, the energy loss function is as follows:

E = 1 - S(F_t^gen[m_gen], F_t^gud[m_gud])

wherein S is the editing similarity, F_t^gen is the first intermediate sampling feature, F_t^gud is the second intermediate sampling feature, m_gen is the target editing region before editing, and m_gud is the target editing region after editing.
Optionally, the calculation formula of the target guidance gradient is as follows:

g_t = ∇_{Z_t} E

that is, the gradient of the energy loss E with respect to the hidden space image feature Z_t.
optionally, the target image includes a first target image and a second target image, where the first target image carries image features of the natural image to be edited, and the second target image carries image features of the natural image to be edited and the reference image to be edited.
The application also provides an electronic device comprising: at least one processor and a memory communicatively coupled to the at least one processor, the memory storing instructions executable by the at least one processor to enable the at least one processor to perform the steps of the image editing method as described above.
The present application also provides a computer-readable storage medium having stored thereon a program for implementing an image editing method, which when executed by a processor implements the steps of the image editing method as described above.
The present application also provides a computer program product comprising a computer program which, when executed by a processor, implements the steps of an image editing method as described above.
The present application provides an image editing method, an image editing device, an electronic device, and a readable storage medium: an image to be edited and a corresponding image diffusion model are acquired; features of the image to be edited are extracted through the image diffusion model to obtain hidden space image features of the image to be edited; the image to be edited is iteratively diffused according to the hidden space image features to obtain a hidden space feature image; and image editing is performed on the hidden space feature image through the image diffusion model to obtain a target image.
According to the method, when the image to be edited is edited, the image to be edited and its corresponding image diffusion model are first obtained. Feature extraction is then performed on the image to be edited through the image diffusion model, so that the hidden space image features of the image to be edited are extracted in the hidden space. The image to be edited is then iteratively diffused based on the hidden space image features to obtain the hidden space feature image, and finally the image diffusion model completes the image editing of the hidden space feature image to obtain the target image. Because the image diffusion model can rely on the image to be edited to output a target image meeting the user's requirement, and has strong generalization capability, the image type of the input image need not be limited when performing image editing, so that any image can be edited into a target image meeting the user's requirement.
After the hidden space image features of the image to be edited are extracted, iterative diffusion of the image to be edited can be carried out based on these features to obtain a hidden space feature image in the hidden space, so that arbitrary editing of the image to be edited becomes possible. Finally, image editing of the hidden space feature image is performed through the image diffusion model to obtain a target image meeting the user's editing requirement; that is, by means of the strong generalization capability of the image diffusion model, any image can be edited into a target image meeting the user's editing requirement.
Based on this method, feature extraction of the image to be edited is performed through the image diffusion model to obtain its hidden space image features; iterative diffusion of the image to be edited is completed depending on these features to obtain the hidden space feature image; and finally image editing of the hidden space feature image is completed through the image diffusion model to obtain the target image. Editing thus relies on the strong generalization capability of the image diffusion model rather than on a generative adversarial network (GAN). This overcomes the technical defect that a GAN model can only edit images of specific categories, so that when a user wants to edit a natural image, the editing effect meeting the user's requirement cannot be achieved and the user's editing experience suffers, thereby improving the editing generalization of image editing.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application.
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are required to be used in the description of the embodiments or the prior art will be briefly described below, and it will be obvious to those skilled in the art that other drawings can be obtained from these drawings without inventive effort.
Fig. 1 is a flowchart of an image editing method according to an embodiment of the present application;
fig. 2 is a schematic diagram of an editing flow of an image editing method according to an embodiment of the present application for performing image editing based on an image diffusion model;
fig. 3 is a schematic structural diagram of an image editing apparatus according to a second embodiment of the present disclosure;
fig. 4 is a schematic structural diagram of an electronic device according to a third embodiment of the present application.
The implementation, functional features and advantages of the present application will be further described with reference to the accompanying drawings in conjunction with the embodiments.
Detailed Description
In order to make the above objects, features and advantages of the present invention more comprehensible, the following description of the embodiments accompanied with the accompanying drawings will be given in detail. It will be apparent that the described embodiments are only some, but not all, embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Example 1
An embodiment of the present application provides an image editing method, in a first embodiment of the image editing method of the present application, referring to fig. 1, the image editing method includes:
step S10, obtaining an image to be edited and a corresponding image diffusion model;
step S20, extracting features of the image to be edited through the image diffusion model to obtain hidden space image features of the image to be edited;
step S30, carrying out iterative diffusion on the image to be edited according to the hidden space image characteristics to obtain a hidden space characteristic image;
and S40, performing image editing on the hidden space feature image through the image diffusion model to obtain a target image.
In this embodiment, it should be noted that although fig. 1 shows a logical sequence, in some cases the steps shown or described may be performed in an order different from that shown here. The image editing device on which the image editing method is deployed may specifically be a computer, a mobile phone, or the like. The image to be edited characterizes an image awaiting image editing; the image content carried by the image to be edited is not specifically limited in this embodiment, that is, the image to be edited may be a human image or an animal image. The image diffusion model is used to implement image editing of the image to be edited, that is, it can generate new images similar to its training data. It can be understood that the image diffusion model is a diffusion model trained in advance: the greater the amount of training data, the greater the generalization capability of the image diffusion model. The image diffusion model may specifically include an autoencoder, a decoder, an inverse sampling unit, a diffusion unit, a denoising unit, and the like. Before feature extraction of the image to be edited is performed by the image diffusion model, the image needs to be encoded into the hidden space by the autoencoder, after which it is further characterized by the image diffusion model.
Additionally, it should be noted that the hidden space feature image characterizes the feature image in the hidden space. After the hidden space image features of an image to be edited are found through the inverse sampling process of the image diffusion model, the original image can further be reconstructed through the forward sampling process of the image diffusion model, where both the inverse sampling process and the forward sampling process are completed through T iterative steps. For example, in one implementation, the image to be edited and a pre-trained image diffusion model ε_θ can be selected first; the image to be edited is then encoded into the hidden space by the autoencoder of the image diffusion model, and the image diffusion model ε_θ performs an inverse sampling process to find the hidden space image features Z_t of the image to be edited. At each diffusion time step t, the hidden space image feature Z_t is updated iteratively, and after the iterative diffusion is completed, the edited image is finally decoded from the hidden space by the decoder of the image diffusion model to obtain the target image, where the target image characterizes the image meeting the editing requirement of the user.
Additionally, since the image generation process of the image diffusion model includes T sampling steps, the hidden space of the image diffusion model is discrete and may include Z_T, ..., Z_0. A drag-type image editing operation is modeled as a change in the correspondence of features, that is, image features are moved from their position before editing to their position after editing, so that the image to be edited is edited in accordance with the user's editing requirement using the pre-trained image diffusion model without any training.
As an example, steps S10 to S40 include: acquiring an image to be edited and a pre-trained image diffusion model; encoding the image to be edited into the hidden space through the autoencoder of the image diffusion model, and inversely sampling the image to be edited in the hidden space through the image diffusion model to obtain the hidden space image features of the image to be edited; iteratively diffusing the image to be edited through the image diffusion model according to the hidden space image features at each time step to obtain a hidden space feature image; and decoding the hidden space feature image through the decoder of the image diffusion model to obtain the target image.
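The four steps S10 to S40 can be sketched as a pipeline. The sketch below is a minimal illustration, not the patent's implementation: the encoder, decoder, noise schedule, and additive guidance term are all toy stand-ins invented for clarity.

```python
# Toy sketch of the S10-S40 pipeline: encode -> inverse sampling ->
# guided iterative diffusion -> decode. Every component is a trivial
# stand-in for the real autoencoder / diffusion model.

def encode(image):
    # stand-in for the autoencoder: pixels -> hidden space code
    return [p / 255.0 for p in image]

def decode(z):
    # stand-in for the decoder: hidden space code -> pixels
    return [round(v * 255.0) for v in z]

def inverse_sample(z0, steps):
    # stand-in for iterative inverse sampling: Z_0 -> Z_T
    z = list(z0)
    for _ in range(steps):
        z = [v * 0.9 for v in z]   # deterministic "noising" step
    return z

def guided_diffusion(zT, steps, guidance=0.0):
    # stand-in for iterative diffusion with an additive edit-guidance term
    z = list(zT)
    for _ in range(steps):
        z = [v / 0.9 + guidance for v in z]
    return z

def edit_image(image, steps=10, guidance=0.0):
    z0 = encode(image)                              # S10/S20
    zT = inverse_sample(z0, steps)                  # hidden space features
    z_edit = guided_diffusion(zT, steps, guidance)  # S30
    return decode(z_edit)                           # S40
```

With guidance set to 0 the pipeline is an identity round trip (the decoded image equals the input), while a nonzero guidance term shifts the result, which is the role played by the target guidance gradient described later in this embodiment.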
In the embodiment of the present application, the image to be edited and the pre-trained image diffusion model are first obtained; feature extraction of the image to be edited is then performed through the image diffusion model to obtain the hidden space image features of the image to be edited; and finally, based on the image diffusion model, the image is decoded into the target image after iterative diffusion over a preset number of time steps. Because the image diffusion model has strong generalization capability, any image to be edited can undergo image editing that meets the user's editing requirement. This avoids relying on a generative adversarial network for image editing, and overcomes the technical defect that a GAN model can only edit images of specific categories, so that when a user wants to edit a natural image, the required editing effect cannot be achieved and the user's editing experience suffers.
The step of extracting features of the image to be edited through the image diffusion model to obtain hidden space image features of the image to be edited comprises the following steps:
step A10, performing iterative inverse sampling on the natural image to be edited through the image diffusion model to generate natural image characteristics of the natural image to be edited in a hidden space;
step A20, performing iterative inverse sampling on the natural image to be edited through the image diffusion model to generate reference image characteristics of the natural image to be edited in a hidden space;
and step A30, the natural image features and the reference image features are used as the hidden space image features together.
In this embodiment, it should be noted that, in the process of image editing, the user may require the edited target image to carry image elements not included in the original image to be edited. For example, suppose image A contains an apple and image B contains a banana, and the user's editing requirement is a target image containing both the apple and the banana; a reference image then needs to be prepared at the initial stage of image editing. That is, the natural image to be edited characterizes the natural image on which image editing will be performed, and the reference image to be edited characterizes the reference image for that natural image. When feature extraction is performed, it needs to be performed separately on the different images. For example, in one implementable manner, after the natural image to be edited and the reference image to be edited are selected, both are encoded into the hidden space through the autoencoder of the image diffusion model, and feature extraction is then performed on the natural image to be edited and the reference image to be edited respectively to obtain their respective hidden space image features.
As an example, steps A10 to A30 include: performing iterative inverse sampling on the natural image to be edited through the image diffusion model to generate natural image characteristics of the natural image to be edited in the hidden space; performing iterative inverse sampling on the reference image to be edited through the image diffusion model to generate reference image characteristics of the reference image to be edited in the hidden space; and taking the natural image characteristics and the reference image characteristics together as the hidden space image features.
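The iterative inverse sampling above, and the forward sampling that undoes it, can be illustrated with a toy deterministic scheme in the style of DDIM inversion. Everything below is a hedged sketch: the constant noise predictor and the linear alpha schedule are invented stand-ins for the trained ε_θ and its real schedule.

```python
import math

# Toy DDIM-style inversion sketch. eps() is a constant stand-in for the
# trained noise predictor eps_theta; because it is deterministic, the
# forward sampling pass exactly undoes the inversion, which is the
# property the hidden space image features rely on.

ALPHAS = [1.0 - 0.08 * t for t in range(11)]  # toy alpha-bar schedule, t = 0..10

def eps(z, t):
    return 0.5  # invented stand-in for eps_theta(z, t)

def invert_step(z, t):
    # one inverse-sampling step: Z_t -> Z_{t+1}
    a, a_next = ALPHAS[t], ALPHAS[t + 1]
    e = eps(z, t)
    x0 = (z - math.sqrt(1 - a) * e) / math.sqrt(a)
    return math.sqrt(a_next) * x0 + math.sqrt(1 - a_next) * e

def sample_step(z, t):
    # one forward-sampling step: Z_{t+1} -> Z_t (algebraic inverse of above)
    a, a_next = ALPHAS[t], ALPHAS[t + 1]
    e = eps(z, t)
    x0 = (z - math.sqrt(1 - a_next) * e) / math.sqrt(a_next)
    return math.sqrt(a) * x0 + math.sqrt(1 - a) * e

def invert(z0):
    z = z0
    for t in range(len(ALPHAS) - 1):
        z = invert_step(z, t)
    return z

def sample(zT):
    z = zT
    for t in reversed(range(len(ALPHAS) - 1)):
        z = sample_step(z, t)
    return z
```

Running sample(invert(z)) reconstructs z up to floating-point error; both the natural image and the reference image would be inverted this way to obtain their hidden space features.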
Before the step of iteratively diffusing the image to be edited according to the hidden space image features to obtain the hidden space feature image, the image editing method further comprises the following steps:
step B10, acquiring first intermediate sampling information of the natural image to be edited in an iterative inverse sampling process and second intermediate sampling information of the reference image to be edited in the iterative inverse sampling process;
step B20, converting the first intermediate sampling information into a first intermediate sampling feature, and converting the second intermediate sampling information into a second intermediate sampling feature;
step B30, calculating a target guiding gradient of the hidden space image feature according to the first intermediate sampling feature and the second intermediate sampling feature;
And step B40, carrying out iterative diffusion on the image to be edited according to the target guide gradient and the hidden space image characteristics to obtain a hidden space characteristic image.
In this embodiment, it should be noted that a large number of intermediate features are generated in the process of inverse sampling the natural image to be edited and the reference image to be edited. In order to fully utilize these intermediate features, the embodiment of the present application proposes to construct an intermediate storage during the inverse sampling process, so as to reuse the intermediate information and guide the subsequent editing process. It can be understood that, in the sample generation stage, the guidance information provided by the intermediate feature storage may be divided into the following two parts: 1) feature transfer, for maintaining consistency between the target image and the image to be edited; and 2) converting editing operations into a gradient correction in diffusion. By multiplexing the guidance information of image editing from the inverse sampling process, the additional computing cost of generating guidance information can be saved. For example, in one implementation, the natural image features Z_t of the natural image to be edited in the hidden space are obtained after t steps of inverse sampling, and an intermediate storage is built in which {Z_t^gud, K_t^gud, V_t^gud} characterizes the first intermediate sampling information of the natural image to be edited and {Z_t^ref, K_t^ref, V_t^ref} characterizes the second intermediate sampling information of the reference image to be edited; if no reference image to be edited is present during image editing, {Z_t^ref, K_t^ref, V_t^ref} is null. Taking the generated natural image feature Z_t as a starting point, an edited image is generated by sampling over t steps, and in this process two guidance branches are arranged: 1) {K_t^gud, V_t^gud} and {K_t^ref, V_t^ref} replace, by means of feature transfer, the keys and values in the self-attention module of the image diffusion model ε_θ at the current time step, so as to keep the final image editing result consistent with the original image to be edited; 2) a guidance gradient is generated by the energy loss function, where m_gud and m_gen represent, in sequence, the position of the image content of the image to be edited before editing and after editing. This constraint is executed in the feature domain: Z_t^gud and Z_t^gen are converted to the feature domain through the image diffusion model to generate F_t^gud and F_t^gen, where F_t^gud is the first intermediate sampling feature and F_t^gen is the second intermediate sampling feature, and the target guidance gradient is used to measure the loss between the pre-edit region of the first intermediate sampling feature and the post-edit region of the second intermediate sampling feature.
As an example, steps B10 to B40 include: acquiring first intermediate sampling information of the natural image to be edited in an iterative inverse sampling process and second intermediate sampling information of the reference image to be edited in the iterative inverse sampling process; converting the first intermediate sampling information to a feature domain to obtain a first intermediate sampling feature, and converting the second intermediate sampling information to the feature domain to obtain a second intermediate sampling feature; calculating a target guiding gradient of the hidden space image feature according to the first intermediate sampling feature and the second intermediate sampling feature; and carrying out iterative diffusion on the image to be edited according to the target guide gradient and the hidden space image characteristics to obtain a hidden space characteristic image.
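The feature-transfer branch described above, where keys and values stored during inverse sampling replace those of the self-attention module at the current time step, can be sketched as follows. The attention implementation, the store layout, and the toy feature values are illustrative assumptions, not the patent's code.

```python
import math

def softmax(xs):
    # numerically stable softmax over a list of scores
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def self_attention(Q, K, V):
    # minimal single-head attention over lists of feature vectors
    d = len(Q[0])
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        w = softmax(scores)
        out.append([sum(wi * v[j] for wi, v in zip(w, V))
                    for j in range(len(V[0]))])
    return out

# Intermediate storage built during inverse sampling: per time step t we keep
# the latent Z_t plus the self-attention keys K_t and values V_t.
store = {t: {"Z": [0.1 * t], "K": [[1.0, 0.0]], "V": [[0.5, 0.5]]}
         for t in range(3)}

def guided_attention(Q, K_cur, V_cur, t, use_store=True):
    # feature transfer: swap in the stored K/V so the edited result stays
    # consistent with the original image; otherwise use the current K/V
    if use_store and store.get(t):
        return self_attention(Q, store[t]["K"], store[t]["V"])
    return self_attention(Q, K_cur, V_cur)
```

When the storage holds an entry for step t, the output is computed against the stored keys and values; when it does not (e.g. no reference image was provided), the current step's own keys and values are used.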
Wherein the step of calculating the target guidance gradient of the latent spatial image feature from the first intermediate sampling feature and the second intermediate sampling feature comprises:
step C10, calculating editing similarity of a target editing area of the image to be edited before and after editing according to the first intermediate sampling feature and the second intermediate sampling feature;
and step C20, calculating target guiding gradients of the hidden space image features according to the energy loss function corresponding to the editing similarity.
In this embodiment, it should be noted that the editing similarity characterizes the similarity between the regions of the first intermediate sampling feature and the second intermediate sampling feature before and after editing. The energy loss function is constructed by multiplexing the image diffusion model, and the calculated target guidance gradient is introduced into the iterative diffusion process of the image diffusion model to finally obtain the hidden space feature image, so that the training overhead of an energy function can be avoided while image editing accuracy is ensured. The target editing region refers to a designated region in the image to be edited.
As an example, steps C10 to C20 include: calculating to obtain editing similarity of the target editing area of the image to be edited before and after editing through the first intermediate sampling feature and the second intermediate sampling feature; and calculating the target guiding gradient of the hidden space image characteristic through an energy loss function constructed by the editing similarity.
Wherein the energy loss function is as follows:
wherein S is the editing similarity, F_t^gen is the first intermediate sampling feature, F_t^gud is the second intermediate sampling feature, m^gen is the target editing area before editing, and m^gud is the target editing area after editing.
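Under the assumption that the editing similarity takes a common masked-feature cosine form (the patent text does not reproduce the formula itself, so this exact form is an assumption), the energy loss can be sketched as:

```python
import torch
import torch.nn.functional as F

def editing_similarity(f_gen, f_gud, m_gen, m_gud):
    """Cosine similarity between masked feature regions (assumed form).

    f_gen, f_gud: intermediate sampling features, shape (C, H, W).
    m_gen, m_gud: binary masks of the target editing region, shape (H, W).
    """
    v_gen = (f_gen * m_gen).flatten(1).mean(dim=1)  # region-pooled features
    v_gud = (f_gud * m_gud).flatten(1).mean(dim=1)
    return F.cosine_similarity(v_gen, v_gud, dim=0)

def energy_loss(f_gen, f_gud, m_gen, m_gud):
    # Higher similarity -> lower energy; the gradient of this loss
    # with respect to the latent provides the editing guidance.
    return 1.0 - editing_similarity(f_gen, f_gud, m_gen, m_gud)
```

Differentiating this loss with respect to the hidden space image feature yields the target guidance gradient used in the iterative diffusion.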
The calculation formula of the target guidance gradient is as follows:
the target image comprises a first target image and a second target image, wherein the first target image carries image characteristics of the natural image to be edited, and the second target image carries image characteristics of the natural image to be edited and the reference image to be edited.
It can be understood that the first target image may be a natural image obtained by performing image editing operations such as moving an object within the image, changing the image size, or dragging points of the image on the natural image to be edited. The second target image may be a natural image obtained by copy-pasting content or transferring appearance between the natural image to be edited and the reference image to be edited. Because the reference image to be edited is available during editing, different image editing tasks can be performed through the image diffusion model, thereby expanding the scene adaptability of image editing based on the image diffusion model.
In an embodiment, referring to fig. 2, fig. 2 is a schematic diagram of an editing flow for editing an image based on an image diffusion model. The natural image X to be edited and the reference image Y to be edited are input into the image diffusion model together; the image diffusion model performs an inverse sampling process to obtain the hidden space image features of the different images, and the intermediate features in the inverse sampling process are stored to provide editing guidance for the sampling process of the image diffusion model, finally yielding the target image.
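The flow of fig. 2 can be sketched as the following skeleton, where `invert`, `sample`, and `energy_fn` are illustrative callables standing in for the diffusion model's inverse sampling, guided sampling, and editing-guidance energy; the names are assumptions, not the patent's code.

```python
import torch

def edit_image(x, y, invert, sample, energy_fn, T=50):
    """Sketch of the fig. 2 editing flow (illustrative).

    invert(img, T) -> (hidden-space feature, per-step intermediate features);
    sample(z, T, guidance) -> target image, consulting `guidance` each step.
    """
    z_x, feats_x = invert(x, T)   # hidden space feature of natural image X
    z_y, feats_y = invert(y, T)   # hidden space feature of reference image Y
    # Editing guidance: the stored intermediate features of both images
    # are compared at each step to steer sampling toward the edit.
    guidance = lambda z, t: energy_fn(z, t, feats_x[t], feats_y[t])
    return sample(z_x, T, guidance)  # target image
```

The design choice here mirrors the description above: the inverse sampling pass both produces the hidden space image features and caches the intermediate features that later guide the forward (denoising) pass.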
The embodiment of the application provides an image editing method, namely, an image to be edited and a corresponding image diffusion model are acquired; extracting features of the image to be edited through the image diffusion model to obtain hidden space image features of the image to be edited; according to the hidden space image characteristics, carrying out iterative diffusion on the image to be edited to obtain a hidden space characteristic image; and carrying out image editing on the hidden space feature image through the image diffusion model to obtain a target image.
When the image to be edited is edited, the image to be edited and its corresponding image diffusion model are first acquired. Feature extraction is then performed on the image to be edited through the image diffusion model, so that the hidden space image features of the image to be edited in the hidden space are extracted. Next, iterative diffusion is performed on the image to be edited using the hidden space image features to obtain the hidden space feature image. Finally, the image diffusion model completes the image editing of the hidden space feature image to obtain the target image. Because the image diffusion model can output a target image meeting the user requirement for any input image to be edited, it has strong generalization capability; the image type of the image to be edited therefore does not need to be restricted at input time, and any image can be edited to obtain a target image meeting the user requirement.
After the hidden space image features of the image to be edited are extracted, iterative diffusion of the image to be edited can be performed based on the hidden space image features to obtain the hidden space feature image in the hidden space, so that the image to be edited can be edited arbitrarily. Finally, image editing of the hidden space feature image is performed through the image diffusion model to obtain a target image meeting the user's editing requirement; that is, any image can be edited into a target image meeting the user's editing requirement by means of the strong generalization capability of the image diffusion model.
Based on the above, the embodiment of the application performs feature extraction on the image to be edited through the image diffusion model to obtain the hidden space image features of the image to be edited in the hidden space, completes iterative diffusion of the image to be edited using the hidden space image features to obtain the hidden space feature image, and finally completes image editing of the hidden space feature image using the image diffusion model to obtain the target image. Editing of the image to be edited is thus completed by means of the strong generalization capability of the image diffusion model, rather than by relying on a generative adversarial network (GAN). This overcomes the technical defect that a GAN model can only edit images of specific categories, so that when a user wants to edit a natural image, an editing effect meeting the user requirement cannot be achieved and the user easily experiences poor image editing, thereby improving the editing generalization of image editing.
Example two
An embodiment of the present application further provides an image editing apparatus, referring to fig. 3, including:
an acquiring module 101, configured to acquire an image to be edited and a corresponding image diffusion model;
the extracting module 102 is configured to perform feature extraction on the image to be edited through the image diffusion model to obtain hidden space image features of the image to be edited;
the diffusion module 103 is configured to iteratively diffuse the image to be edited according to the hidden space image feature to obtain a hidden space feature image;
and the editing module 104 is used for carrying out image editing on the hidden space feature image through the image diffusion model to obtain a target image.
Optionally, the image to be edited includes a natural image to be edited and a reference image to be edited of the natural image to be edited, and the extracting module 102 is further configured to:
performing iterative inverse sampling on the natural image to be edited through the image diffusion model to generate natural image characteristics of the natural image to be edited in a hidden space;
performing iterative inverse sampling on the reference image to be edited through the image diffusion model to generate reference image characteristics of the reference image to be edited in a hidden space;
and taking the natural image characteristic and the reference image characteristic as the hidden space image characteristic.
Optionally, the image editing apparatus is further configured to:
acquiring first intermediate sampling information of the natural image to be edited in an iterative inverse sampling process and second intermediate sampling information of the reference image to be edited in the iterative inverse sampling process;
converting the first intermediate sampling information into a first intermediate sampling feature and converting the second intermediate sampling information into a second intermediate sampling feature;
calculating a target guiding gradient of the hidden space image feature according to the first intermediate sampling feature and the second intermediate sampling feature;
and carrying out iterative diffusion on the image to be edited according to the target guide gradient and the hidden space image characteristics to obtain a hidden space characteristic image.
Optionally, the image editing apparatus is further configured to:
according to the first intermediate sampling feature and the second intermediate sampling feature, calculating editing similarity of a target editing area of the image to be edited before and after editing;
and calculating the target guiding gradient of the hidden space image characteristic according to the energy loss function corresponding to the editing similarity.
Optionally, the energy loss function is as follows:
wherein S is the editing similarity, F_t^gen is the first intermediate sampling feature, F_t^gud is the second intermediate sampling feature, m^gen is the target editing area before editing, and m^gud is the target editing area after editing.
Optionally, the calculation formula of the target guidance gradient is as follows:
optionally, the target image includes a first target image and a second target image, where the first target image carries image features of the natural image to be edited, and the second target image carries image features of the natural image to be edited and the reference image to be edited.
The image editing device provided by the invention adopts the image editing method in the embodiment, and solves the technical problem of poor editing generalization of image editing. Compared with the prior art, the image editing device provided by the embodiment of the invention has the same beneficial effects as the image editing method provided by the embodiment, and other technical features in the image editing device are the same as the features disclosed by the method of the embodiment, and are not described in detail herein.
Example III
The embodiment of the invention provides electronic equipment, which comprises: at least one processor; and a memory communicatively coupled to the at least one processor; the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor, so that the at least one processor can execute the image editing method in the first embodiment.
Referring now to fig. 4, a schematic diagram of an electronic device suitable for use in implementing embodiments of the present disclosure is shown. The electronic devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), in-vehicle terminals (e.g., in-vehicle navigation terminals), and the like, and stationary terminals such as digital TVs, desktop computers, and the like. The electronic device shown in fig. 4 is merely an example and should not be construed to limit the functionality and scope of use of the disclosed embodiments.
As shown in fig. 4, the electronic device may include a processing device 1001 (e.g., a central processing unit, a graphics processor, etc.), which may perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 1002 or a program loaded from a storage device 1003 into a Random Access Memory (RAM) 1004. In the RAM 1004, various programs and data required for the operation of the electronic device are also stored. The processing device 1001, the ROM 1002, and the RAM 1004 are connected to each other by a bus 1005. An input/output (I/O) interface 1006 is also connected to the bus.
In general, the following systems may be connected to the I/O interface 1006: input devices 1007 including, for example, a touch screen, touchpad, keyboard, mouse, image sensor, microphone, accelerometer, gyroscope, and the like; an output device 1008 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage device 1003 including, for example, a magnetic tape, a hard disk, and the like; and communication means 1009. The communication means may allow the electronic device to communicate with other devices wirelessly or by wire to exchange data. While electronic devices having various systems are shown in the figures, it should be understood that not all of the illustrated systems are required to be implemented or provided. More or fewer systems may alternatively be implemented or provided.
In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from a network via the communication device 1009, or installed from the storage device 1003, or installed from the ROM 1002. The above-described functions defined in the method of the embodiment of the present disclosure are performed when the computer program is executed by the processing device 1001.
The electronic equipment provided by the invention adopts the image editing method in the embodiment, and solves the technical problem of poor editing generalization of image editing. Compared with the prior art, the electronic device provided by the embodiment of the invention has the same beneficial effects as the image editing method provided by the embodiment, and other technical features in the electronic device are the same as the features disclosed by the method of the embodiment, and are not repeated here.
It should be understood that portions of the present disclosure may be implemented in hardware, software, firmware, or a combination thereof. In the description of the above embodiments, particular features, structures, materials, or characteristics may be combined in any suitable manner in any one or more embodiments or examples.
Example IV
The present embodiment provides a computer-readable storage medium having computer-readable program instructions stored thereon for performing the image editing method in the above-described embodiment.
The computer-readable storage medium according to the embodiments of the present invention may be, for example, a USB flash disk, but is not limited thereto; it may be an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system or device, or any combination of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In this embodiment, a computer-readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system or device. Program code embodied on a computer-readable storage medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, fiber-optic cables, RF (radio frequency), and the like, or any suitable combination of the foregoing.
The above-described computer-readable storage medium may be contained in an electronic device; or may exist alone without being assembled into an electronic device.
The computer-readable storage medium carries one or more programs that, when executed by an electronic device, cause the electronic device to: acquiring an image to be edited and a corresponding image diffusion model; extracting features of the image to be edited through the image diffusion model to obtain hidden space image features of the image to be edited; according to the hidden space image characteristics, carrying out iterative diffusion on the image to be edited to obtain a hidden space characteristic image; and carrying out image editing on the hidden space feature image through the image diffusion model to obtain a target image.
Computer program code for carrying out operations of the present disclosure may be written in one or more programming languages, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules described in the embodiments of the present disclosure may be implemented in software or hardware. In some cases, the name of a module does not constitute a limitation of the module itself.
The computer readable storage medium provided by the invention stores the computer readable program instructions for executing the image editing method, and solves the technical problem of poor editing generalization of image editing. Compared with the prior art, the beneficial effects of the computer readable storage medium provided by the embodiment of the invention are the same as those of the image editing method provided by the above embodiment, and are not described herein.
Example five
The present application also provides a computer program product comprising a computer program which, when executed by a processor, implements the steps of an image editing method as described above.
The computer program product solves the technical problem that the editing generalization of image editing is poor. Compared with the prior art, the beneficial effects of the computer program product provided by the embodiment of the present invention are the same as those of the image editing method provided by the above embodiment, and are not described herein.
The foregoing description is only of the preferred embodiments of the present application and is not intended to limit the scope of the claims, and all equivalent structures or equivalent processes using the descriptions and drawings of the present application, or direct or indirect application in other related technical fields are included in the scope of the claims.

Claims (10)

1. An image editing method, characterized in that the image editing method comprises:
acquiring an image to be edited and a corresponding image diffusion model;
extracting features of the image to be edited through the image diffusion model to obtain hidden space image features of the image to be edited;
according to the hidden space image characteristics, carrying out iterative diffusion on the image to be edited to obtain a hidden space characteristic image;
and carrying out image editing on the hidden space feature image through the image diffusion model to obtain a target image.
2. The image editing method according to claim 1, wherein the image to be edited includes a natural image to be edited and a reference image to be edited of the natural image to be edited,
the step of extracting features of the image to be edited through the image diffusion model to obtain hidden space image features of the image to be edited comprises the following steps:
performing iterative inverse sampling on the natural image to be edited through the image diffusion model to generate natural image characteristics of the natural image to be edited in a hidden space;
performing iterative inverse sampling on the reference image to be edited through the image diffusion model to generate reference image characteristics of the reference image to be edited in a hidden space;
and taking the natural image characteristic and the reference image characteristic as the hidden space image characteristic.
3. The image editing method according to claim 2, wherein before the step of iteratively diffusing the image to be edited based on the hidden space image feature to obtain a hidden space feature image, the image editing method further comprises:
acquiring first intermediate sampling information of the natural image to be edited in an iterative inverse sampling process and second intermediate sampling information of the reference image to be edited in the iterative inverse sampling process;
converting the first intermediate sampling information into a first intermediate sampling feature and converting the second intermediate sampling information into a second intermediate sampling feature;
calculating a target guiding gradient of the hidden space image feature according to the first intermediate sampling feature and the second intermediate sampling feature;
and carrying out iterative diffusion on the image to be edited according to the target guide gradient and the hidden space image characteristics to obtain a hidden space characteristic image.
4. The image editing method of claim 3, wherein the step of calculating a target guidance gradient for the hidden space image feature from the first intermediate sampled feature and the second intermediate sampled feature comprises:
According to the first intermediate sampling feature and the second intermediate sampling feature, calculating editing similarity of a target editing area of the image to be edited before and after editing;
and calculating the target guiding gradient of the hidden space image characteristic according to the energy loss function corresponding to the editing similarity.
5. The image editing method of claim 4, wherein the energy loss function is as follows:
wherein S is the editing similarity, F_t^gen is the first intermediate sampling feature, F_t^gud is the second intermediate sampling feature, m^gen is the target editing area before editing, and m^gud is the target editing area after editing.
6. The image editing method of claim 4, wherein the target guidance gradient is calculated as follows:
wherein the three symbols denote, respectively, the target guidance gradient, the intermediate guidance gradient, and the initial guidance gradient.
7. The image editing method according to claim 2, wherein the target image includes a first target image and a second target image, wherein the first target image carries image features of the natural image to be edited, and the second target image carries image features of the natural image to be edited and the reference image to be edited.
8. An image editing apparatus, characterized in that the image editing apparatus comprises:
the acquisition module is used for acquiring the image to be edited and the corresponding image diffusion model;
the extraction module is used for extracting the characteristics of the image to be edited through the image diffusion model to obtain the hidden space image characteristics of the image to be edited;
the diffusion module is used for carrying out iterative diffusion on the image to be edited according to the hidden space image characteristics to obtain a hidden space characteristic image;
and the editing module is used for carrying out image editing on the hidden space feature image through the image diffusion model to obtain a target image.
9. An electronic device, the electronic device comprising:
at least one processor;
a memory communicatively coupled to the at least one processor;
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the steps of the image editing method of any of claims 1 to 7.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored thereon a program that implements an image editing method, the program implementing the image editing method being executed by a processor to implement the steps of the image editing method according to any one of claims 1 to 7.
CN202311787479.3A 2023-12-22 2023-12-22 Image editing method, device, electronic equipment and readable storage medium Pending CN117726720A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311787479.3A CN117726720A (en) 2023-12-22 2023-12-22 Image editing method, device, electronic equipment and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311787479.3A CN117726720A (en) 2023-12-22 2023-12-22 Image editing method, device, electronic equipment and readable storage medium

Publications (1)

Publication Number Publication Date
CN117726720A true CN117726720A (en) 2024-03-19

Family

ID=90201400

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311787479.3A Pending CN117726720A (en) 2023-12-22 2023-12-22 Image editing method, device, electronic equipment and readable storage medium

Country Status (1)

Country Link
CN (1) CN117726720A (en)

Similar Documents

Publication Publication Date Title
CN109597617B (en) Method and device for quickly generating service page based on template
CN110047121B (en) End-to-end animation generation method and device and electronic equipment
CN110365973B (en) Video detection method and device, electronic equipment and computer readable storage medium
CN113778419B (en) Method and device for generating multimedia data, readable medium and electronic equipment
CN110035271B (en) Fidelity image generation method and device and electronic equipment
CN115937033B (en) Image generation method and device and electronic equipment
WO2023138498A1 (en) Method and apparatus for generating stylized image, electronic device, and storage medium
CN110288532B (en) Method, apparatus, device and computer readable storage medium for generating whole body image
CN111652675A (en) Display method and device and electronic equipment
CN111898338B (en) Text generation method and device and electronic equipment
CN112734631A (en) Video image face changing method, device, equipment and medium based on fine adjustment model
CN117726720A (en) Image editing method, device, electronic equipment and readable storage medium
CN111325668A (en) Training method and device for image processing deep learning model and electronic equipment
CN112070888B (en) Image generation method, device, equipment and computer readable medium
CN111696041B (en) Image processing method and device and electronic equipment
KR20110107802A (en) Media portability and compatibility for different destination platforms
CN111611420B (en) Method and device for generating image description information
CN115049537A (en) Image processing method, image processing device, electronic equipment and storage medium
CN114564606A (en) Data processing method and device, electronic equipment and storage medium
CN110033413B (en) Image processing method, device, equipment and computer readable medium of client
CN117376634B (en) Short video music distribution method and device, electronic equipment and storage medium
CN114187408B (en) Three-dimensional face model reconstruction method and device, electronic equipment and storage medium
CN116974684B (en) Map page layout method, map page layout device, electronic equipment and computer readable medium
CN111382556B (en) Data conversion method, device, equipment and storage medium
CN110263797B (en) Method, device and equipment for estimating key points of skeleton and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination