WO2023087656A1 - Image generation method and apparatus - Google Patents


Info

Publication number: WO2023087656A1
Authority: WIPO (PCT)
Prior art keywords: style, fusion, network, sample, target
Application number: PCT/CN2022/094971
Other languages: French (fr), Chinese (zh)
Inventors: 刘明聪, 李强, 秦泽奎, 张国鑫, 万鹏飞, 郑文
Original Assignee: 北京达佳互联信息技术有限公司 (Beijing Dajia Internet Information Technology Co., Ltd.)
Application filed by 北京达佳互联信息技术有限公司
Publication of WO2023087656A1

Classifications

    • G06T3/04
    • G06N3/045 Combinations of networks (G06N Computing arrangements based on specific computational models; G06N3/00 Computing arrangements based on biological models; G06N3/02 Neural networks; G06N3/04 Architecture, e.g. interconnection topology)
    • G06N3/08 Learning methods (G06N3/02 Neural networks)

Definitions

  • The present disclosure relates to the technical field of image processing, and in particular to an image generation method and apparatus, an electronic device, and a storage medium.
  • Image style conversion technology can generate an image with a target style from a given object image, such as a human face, giving it an artistic effect similar to the target style, thereby converting the object image into object style images of different styles such as animation, oil painting, and pencil drawing.
  • The present disclosure provides an image generation method and apparatus, an electronic device, and a storage medium, which can quickly generate high-quality object style images and improve the efficiency of adaptively generating multi-style object style images.
  • The technical solution of the present disclosure is as follows:
  • According to a first aspect, an image generation method includes:
  • acquiring a preset object code and a target style code of a target style;
  • performing style fusion processing on the target style code and the preset object code based on network fusion parameters corresponding to a preset number of network layers in a style fusion network, to obtain a target style fusion code, where the network fusion parameters are determined based on fusion data corresponding to the preset number of network layers and a target fusion weight, and the target fusion weight is obtained by performing fusion weight learning based on the target style code and the preset object code; and
  • inputting the target style fusion code into a target image generation network for image generation processing to obtain a preset object style image corresponding to the target style.
  • In some embodiments, performing style fusion processing on the target style code and the preset object code based on the network fusion parameters corresponding to the preset number of network layers in the style fusion network to obtain the target style fusion code includes:
  • acquiring target fusion control data, where the target fusion control data is used to control the fusion position of the target style code and the preset object code in the style fusion network;
  • In some embodiments, determining the fusion data corresponding to the preset number of network layers according to the target fusion control data includes:
  • In some embodiments, obtaining the target style code of the target style includes:
  • In some embodiments, the method further includes:
  • using the trained style coding network as the style coding network.
  • In some embodiments, obtaining the target style code of the target style includes:
  • In some embodiments, the preset object code is obtained through the following steps:
  • In some embodiments, the method further includes:
  • using the trained style fusion network as the style fusion network, and using the trained image generation network as the target image generation network.
  • In some embodiments, the sample style fusion code includes a first style fusion code and a second style fusion code; performing style fusion processing on the first sample style code and the sample object code based on sample network fusion parameters corresponding to a preset number of network layers to be trained in the style fusion network to be trained, to obtain the sample style fusion code, includes:
  • acquiring first fusion control data and second fusion control data, where the first fusion control data is used to control the first sample style code and the sample object code to be fused starting from the first network layer of the style fusion network to be trained, and the second fusion control data is used to control the first sample style code not to participate in fusion in the style fusion network to be trained;
  • determining, according to the first fusion control data and the second fusion control data respectively, first sample fusion data and second sample fusion data corresponding to the preset number of network layers to be trained;
  • In some embodiments, the sample object style image includes a first sample object style image corresponding to the first style fusion code and a second sample object style image corresponding to the second style fusion code;
  • the discriminant network to be trained includes an object discriminant network, a style object discriminant network, and a style code discriminant network;
  • the target discrimination information includes object discrimination information, style object discrimination information, and style code discrimination information;
  • inputting the sample object style image, the preset style object image, the preset object image, the first sample style code, and the second sample style code into the discriminant network to be trained for style discrimination processing to obtain the target discrimination information includes:
  • According to a second aspect, an image generation method includes:
  • where the first style conversion network is obtained by performing adversarial training on a first preset image generation network based on a first sample object image and preset object style images of the target style generated by any image generation method provided in the first aspect.
  • According to a third aspect, an image generation method includes:
  • where the second style conversion network is obtained by performing adversarial training on a second preset image generation network based on a second sample object image, multiple target style labels, and preset object style images of multiple target styles generated by any image generation method provided in the first aspect.
  • According to another aspect, an image generation apparatus includes:
  • a code acquisition module configured to acquire a preset object code and a target style code of a target style;
  • a first style fusion processing module configured to perform style fusion processing on the target style code and the preset object code based on network fusion parameters corresponding to a preset number of network layers in a style fusion network, to obtain a target style fusion code, where the network fusion parameters are determined based on fusion data corresponding to the preset number of network layers and a target fusion weight, and the target fusion weight is obtained by performing fusion weight learning based on the target style code and the preset object code; and
  • a first image generation processing module configured to input the target style fusion code into a target image generation network for image generation processing to obtain a preset object style image corresponding to the target style.
  • In some embodiments, the first style fusion processing module includes:
  • a target fusion control data acquisition unit configured to acquire target fusion control data, where the target fusion control data is used to control the fusion position of the target style code and the preset object code in the style fusion network;
  • a fusion data determination unit configured to determine the fusion data corresponding to the preset number of network layers according to the target fusion control data;
  • a first splicing processing unit configured to concatenate the target style code and the preset object code to obtain a target splicing code;
  • a first fusion weight learning unit configured to perform fusion weight learning based on the target splicing code to obtain the target fusion weight;
  • a first weighting processing unit configured to weight the fusion data and the target fusion weight to obtain the network fusion parameters; and
  • a first style fusion processing unit configured to perform style fusion processing on the target style code and the preset object code in the preset number of network layers based on the network fusion parameters, to obtain the target style fusion code.
  • In some embodiments, the fusion data determination unit includes:
  • a comparison unit configured to compare the layer numbers of the preset number of network layers with the target fusion control data to obtain a comparison result; and
  • a fusion data determination subunit configured to determine the fusion data corresponding to the preset number of network layers according to the comparison result.
  • In some embodiments, the code acquisition module includes:
  • a reference style image acquisition unit configured to acquire a reference style image of the target style; and
  • a style coding processing unit configured to input the reference style image into the style coding network for style coding processing to obtain the target style code.
  • In some embodiments, the apparatus further includes:
  • a sample image acquisition module configured to acquire a positive sample style image pair and a negative sample style image pair of the target style;
  • a style coding processing module configured to input the positive sample style image pair and the negative sample style image pair into the style coding network to be trained for style coding processing, to obtain the sample style codes corresponding to the positive sample style image pair and the negative sample style image pair;
  • a perceptual processing module configured to input the sample style codes into the perception network to be trained for perceptual processing, to obtain the sample perceptual feature information corresponding to the positive sample style image pair and the negative sample style image pair;
  • a contrast loss information determination module configured to determine contrast loss information according to the sample perceptual feature information;
  • a first network training module configured to train the style coding network to be trained and the perception network to be trained based on the contrast loss information; and
  • a style coding network determination module configured to use the trained style coding network as the style coding network.
  • In some embodiments, the code acquisition module includes:
  • an initial style code generation unit configured to randomly generate an initial style code based on a first preset distribution; and
  • a first perceptual processing unit configured to input the initial style code into the first multi-layer perceptron network for perceptual processing to obtain the target style code.
  • In some embodiments, the code acquisition module includes:
  • an initial object code generation unit configured to randomly generate an initial object code based on a second preset distribution; and
  • a second perceptual processing unit configured to input the initial object code into the second multi-layer perceptron network for perceptual processing to obtain the preset object code.
  • In some embodiments, the apparatus further includes:
  • a sample data acquisition module configured to acquire a first sample style code of the target style, a sample object code, a second sample style code of a non-target style, a preset style object image, and a preset object image;
  • a second style fusion processing module configured to perform style fusion processing on the first sample style code and the sample object code based on sample network fusion parameters corresponding to a preset number of network layers to be trained in the style fusion network to be trained, to obtain a sample style fusion code;
  • a second image generation processing module configured to input the sample style fusion code into the image generation network to be trained for image generation processing, to obtain a sample object style image corresponding to the target style;
  • a style discrimination processing module configured to input the sample object style image, the preset style object image, the preset object image, the first sample style code, and the second sample style code into the discriminant network to be trained for style discrimination processing, to obtain target discrimination information;
  • a target loss information determination module configured to determine target loss information according to the target discrimination information;
  • a second network training module configured to train the style fusion network to be trained, the image generation network to be trained, and the discriminant network to be trained based on the target loss information; and
  • a network determination module configured to use the trained style fusion network as the style fusion network, and use the trained image generation network as the target image generation network.
  • In some embodiments, the sample style fusion code includes a first style fusion code and a second style fusion code, and the second style fusion processing module includes:
  • a sample fusion control data acquisition unit configured to acquire first fusion control data and second fusion control data, where the first fusion control data is used to control the first sample style code and the sample object code to be fused starting from the first network layer in the style fusion network to be trained, and the second fusion control data is used to control the first sample style code not to participate in fusion in the style fusion network to be trained;
  • a sample fusion data determination unit configured to determine the first sample fusion data and the second sample fusion data corresponding to the preset number of network layers to be trained according to the first fusion control data and the second fusion control data;
  • a second splicing processing unit configured to concatenate the first sample style code and the sample object code to obtain a sample splicing code;
  • a second fusion weight learning unit configured to perform fusion weight learning based on the sample splicing code to obtain a sample fusion weight;
  • a second weighting processing unit configured to weight the first sample fusion data and the second sample fusion data with the sample fusion weight respectively, to obtain first sample network fusion parameters and second sample network fusion parameters; and
  • a second style fusion processing unit configured to perform style fusion processing on the first sample style code and the sample object code in the preset number of network layers to be trained based on the first sample network fusion parameters and the second sample network fusion parameters respectively, to obtain the first style fusion code and the second style fusion code.
  • In some embodiments, the sample object style image includes a first sample object style image corresponding to the first style fusion code and a second sample object style image corresponding to the second style fusion code;
  • the discriminant network to be trained includes an object discriminant network, a style object discriminant network, and a style code discriminant network;
  • the target discrimination information includes object discrimination information, style object discrimination information, and style code discrimination information;
  • and the style discrimination processing module includes:
  • an object discrimination processing unit configured to input the second sample object style image and the preset object image into the object discriminant network for object discrimination processing to obtain the object discrimination information;
  • a style object discrimination processing unit configured to input the first sample object style image and the preset style object image into the style object discriminant network for style object discrimination processing to obtain the style object discrimination information; and
  • a style code discrimination processing unit configured to input the second sample object style image, the first sample style code, and the second sample style code into the style code discriminant network for style code discrimination processing to obtain the style code discrimination information.
  • According to another aspect, an image generation apparatus includes:
  • an original object image acquisition module configured to acquire a first original object image of a first target object; and
  • a first style conversion processing module configured to input the first original object image into a first style conversion network for style conversion processing to obtain a first target object style image corresponding to the first target object;
  • where the first style conversion network is obtained by performing adversarial training on a first preset image generation network based on a first sample object image and preset object style images of the target style generated by any image generation method provided in the first aspect.
  • According to another aspect, an image generation apparatus includes:
  • a data acquisition module configured to acquire a second original object image and a target style label of a second target object; and
  • a second style conversion processing module configured to input the second original object image and the target style label into a second style conversion network for style conversion processing to obtain a second target object style image corresponding to the second target object;
  • where the second style conversion network is obtained by performing adversarial training on a second preset image generation network based on a second sample object image, multiple target style labels, and preset object style images of multiple target styles generated by any image generation method provided in the first aspect.
  • According to another aspect, an electronic device includes: a processor; and a memory for storing instructions executable by the processor; where the processor is configured to execute the instructions to implement the method described in any one of the first aspect, the second aspect, and the third aspect above.
  • According to another aspect, a computer-readable storage medium is provided, where, when instructions in the storage medium are executed by a processor of an electronic device, the electronic device is enabled to execute the method described in any one of the first aspect, the second aspect, and the third aspect of the embodiments of the present disclosure.
  • According to another aspect, a computer program product containing instructions is provided, which, when run on a computer, causes the computer to execute the method described in any one of the first aspect, the second aspect, and the third aspect of the embodiments of the present disclosure.
  • In the embodiments of the present disclosure, the stylized image is decoupled into two parts, an object code and a style code. In the style fusion network, style fusion processing is performed on the target style code and the preset object code using network fusion parameters determined from the target fusion weight and the fusion data corresponding to the preset number of network layers, yielding a target style fusion code that effectively integrates the target style. Because the target fusion weight is obtained by fusion weight learning based on the target style code and the preset object code, the fusion weight of the object can be adjusted adaptively under different target styles, so that the object style code under the target style is better fused. While ensuring the stylization effect and the quality of the stylized image, this can greatly improve the efficiency of adaptively generating multi-style object style images.
  • Fig. 1 is a schematic diagram of an application environment according to an exemplary embodiment;
  • Fig. 2 is a flowchart of an image generation method according to an exemplary embodiment;
  • Fig. 3 is a schematic diagram of the network structure of a style coding network according to an exemplary embodiment;
  • Fig. 4 is a flowchart of pre-training a style coding network according to an exemplary embodiment;
  • Fig. 5 is a flowchart of performing style fusion processing on a target style code and a preset object code based on network fusion parameters corresponding to a preset number of network layers in a style fusion network to obtain a target style fusion code, according to an exemplary embodiment;
  • Fig. 6 is a flowchart of pre-training a target image generation network and a style fusion network according to an exemplary embodiment;
  • Fig. 7 is a flowchart of performing style fusion processing on a first sample style code and a sample object code based on sample network fusion parameters corresponding to a preset number of network layers to be trained in a style fusion network to be trained to obtain a sample style fusion code, according to an exemplary embodiment;
  • Fig. 8 is a schematic diagram of training a style fusion network and a target image generation network according to an exemplary embodiment;
  • Fig. 9 is a flowchart of an image generation method according to an exemplary embodiment;
  • Fig. 10 is a flowchart of another image generation method according to an exemplary embodiment;
  • Fig. 11 is a block diagram of an image generation apparatus according to an exemplary embodiment;
  • Fig. 12 is a block diagram of another image generation apparatus according to an exemplary embodiment;
  • Fig. 13 is a block diagram of another image generation apparatus according to an exemplary embodiment;
  • Fig. 14 is a block diagram of an electronic device for image generation according to an exemplary embodiment;
  • Fig. 15 is a block diagram of another electronic device for image generation according to an exemplary embodiment.
  • The user information involved in the present disclosure (including but not limited to user equipment information, user personal information, etc.) and the data involved (including but not limited to data for display, data for analysis, etc.) are information and data authorized by the user or fully authorized by all parties.
  • FIG. 1 is a schematic diagram showing an application environment according to an exemplary embodiment.
  • the application environment may include a terminal 100 and a server 200 .
  • the terminal 100 can be used to provide a stylized image (object style image) generation service of a target object for any user.
  • The terminal 100 may include, but is not limited to, electronic devices such as smartphones, desktop computers, tablet computers, notebook computers, smart speakers, digital assistants, augmented reality (AR)/virtual reality (VR) devices, and smart wearable devices, and may also be software running on the above electronic devices, such as an application program.
  • The operating system running on the electronic device may include, but is not limited to, the Android system, the iOS system, Linux, Windows, and so on.
  • The server 200 can provide background services for the terminal 100, pre-generate object style images for training a style conversion network, and train a style conversion network that can be used to convert object images into stylized images (object style images of a target style).
  • The server 200 can be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, CDN (Content Delivery Network), and big data and artificial intelligence platforms.
  • FIG. 1 is only an application environment provided by the present disclosure, and in actual application, other application environments may also be included, for example, more terminals may be included.
  • the above-mentioned terminal 100 and server 200 may be directly or indirectly connected through wired or wireless communication, which is not limited in this disclosure.
  • Fig. 2 is a flowchart of an image generation method according to an exemplary embodiment. As shown in Fig. 2 , the image generation method is used in electronic devices such as terminals and servers, and includes the following steps.
  • In step S201, a preset object code and a target style code of a target style are obtained.
  • the target style can be any image style.
  • Image styles can be divided in multiple ways according to actual application requirements.
  • In some embodiments, the target style may include but is not limited to image styles such as animation, oil painting, and pencil drawing.
  • the target style code may be coded information that can characterize the style features of the target style.
  • the preset object code may be coded information capable of characterizing object features of a certain type of object.
  • the objects may include, but are not limited to, human faces, cat faces, dog faces, and other objects requiring style conversion.
  • In some embodiments, the target style may be an image style extracted from a certain reference style image; correspondingly, obtaining the target style code of the target style may include: acquiring a reference style image of the target style; and
  • inputting the reference style image into the style coding network for style coding processing to obtain the target style code.
  • A style image may be an image with a certain style. Taking the object being a human face and the target style being an anime style as an example, the style image is an anime human face image.
  • In some embodiments, the style coding network can be obtained by contrastive training of the style coding network to be trained based on positive sample style image pairs and negative sample style image pairs of the target style.
  • the network structure of the style coding network can be preset in combination with actual applications.
  • FIG. 3 is a schematic network structure diagram of a style coding network provided according to an exemplary embodiment.
  • In some embodiments, the style coding network may include a convolutional neural network, a style feature extraction network, a feature splicing network, and a multi-layer perceptron network connected in sequence.
  • The convolutional neural network can be used to extract image feature information of the reference style image; the style feature extraction network can be used to extract style feature information from the image feature information; the feature splicing network can be used to splice the style feature information into a long vector of a preset dimension (the preset dimension is consistent with the input dimension of the multi-layer perceptron network). The multi-layer perceptron network may include two parts connected in sequence: the first part can be used to reduce dimensionality and convert the long vector of the preset dimension into a denser style code, which makes the image generation network easier to train; the second part can be used to further reduce the dimensionality of the coded information output by the first part and transform it from the coding space to the distribution space where the image generation network is located.
  • In some embodiments, the specific network structures of the convolutional neural network, style feature extraction network, feature splicing network, and multi-layer perceptron network can also be set in combination with practical applications.
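  • The sketch below is a minimal, illustrative PyTorch rendering of such a style coding network. The layer sizes, the 512-dimensional code, and the pooling used for style feature extraction are assumptions; the disclosure does not fix the concrete structures of the four sub-networks.

```python
# Minimal sketch of the style coding network, assuming PyTorch; all layer
# sizes and the 512-dim style code are illustrative assumptions.
import torch
import torch.nn as nn

class StyleEncoder(nn.Module):
    def __init__(self, code_dim: int = 512):
        super().__init__()
        # Convolutional neural network: extracts image feature information
        # from the reference style image.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(128, 256, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Style feature extraction network: a global pooling stands in here.
        self.style_pool = nn.AdaptiveAvgPool2d(4)
        # First part of the multi-layer perceptron network: reduces dimension
        # and converts the spliced long vector into a denser style code.
        self.mlp_part1 = nn.Sequential(
            nn.Linear(256 * 4 * 4, 1024), nn.ReLU(),
            nn.Linear(1024, code_dim),
        )
        # Second part: maps the code into the distribution space of the
        # image generation network.
        self.mlp_part2 = nn.Sequential(
            nn.Linear(code_dim, code_dim), nn.ReLU(),
            nn.Linear(code_dim, code_dim),
        )

    def forward(self, ref_image: torch.Tensor) -> torch.Tensor:
        feats = self.backbone(ref_image)
        # Feature splicing network: flatten style features into a long vector
        # whose dimension matches the perceptron input.
        long_vec = self.style_pool(feats).flatten(start_dim=1)
        return self.mlp_part2(self.mlp_part1(long_vec))
```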
  • In some embodiments, the above method may further include a step of pre-training the style coding network. As shown in FIG. 4, pre-training the style coding network may include the following steps:
  • In step S401, a positive sample style image pair and a negative sample style image pair of the target style are obtained.
  • In step S403, the positive sample style image pair and the negative sample style image pair are input into the style coding network to be trained for style coding processing, and the sample style codes corresponding to the positive sample style image pair and the negative sample style image pair are obtained.
  • In step S405, the sample style codes are input into the perception network to be trained for perceptual processing, and the sample perceptual feature information corresponding to the positive sample style image pair and the negative sample style image pair is obtained.
  • In step S407, contrast loss information is determined according to the sample perceptual feature information.
  • In step S409, the style coding network to be trained and the perception network to be trained are trained based on the contrast loss information.
  • In step S411, the trained style coding network to be trained is used as the style coding network.
  • The network structure of the style coding network to be trained is the same as that of the style coding network, but the network parameters differ.
  • In some embodiments, a reference style image of the target style can be obtained, and multiple affine transformations are performed on the reference style image; correspondingly, the style image after each affine transformation forms a positive sample style image pair with the reference style image. In some embodiments, the translation amounts of the multiple affine transformations can differ.
  • In some embodiments, multiple reference style images of non-target styles may be obtained, and these non-target-style reference images are respectively combined with the reference style image of the target style to form multiple negative sample style image pairs of the target style.
  • In some embodiments, multiple affine transformations may also be performed on a reference style image of a non-target style, and the style images after the multiple affine transformations are respectively combined with the reference style image of the target style to form multiple negative sample style image pairs.
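  • As a sketch of constructing such sample pairs (assuming torchvision and PIL; the rotation, translation, and scale ranges are illustrative):

```python
# Sketch of building positive/negative sample style image pairs via random
# affine transformations; transform ranges are illustrative assumptions.
from torchvision import transforms
from PIL import Image

affine = transforms.RandomAffine(degrees=15, translate=(0.1, 0.2), scale=(0.9, 1.1))

def make_pairs(target_ref: Image.Image, non_target_refs: list, n: int = 4):
    # Positive pairs: each affine-transformed view of the target-style
    # reference image is paired with the original reference image.
    positives = [(affine(target_ref), target_ref) for _ in range(n)]
    # Negative pairs: non-target-style reference images (here also
    # affine-transformed) paired with the target-style reference image.
    negatives = [(affine(img), target_ref) for img in non_target_refs]
    return positives, negatives
```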
  • In some embodiments, the sample style codes corresponding to the positive sample style image pair and the negative sample style image pair output by the style coding network to be trained can be input into the perception network to be trained for perceptual processing, to obtain the sample perceptual feature information corresponding to the positive sample style image pair and the negative sample style image pair, and the contrast loss information is determined according to the sample perceptual feature information.
  • In some embodiments, a preset contrast loss function may be used in the process of determining the contrast loss information according to the sample perceptual feature information.
  • In some embodiments, the preset contrast loss function may be any kind of contrast loss function, for example, the NT-Xent contrast loss function (the normalized temperature-scaled cross-entropy loss).
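  • A sketch of the NT-Xent loss over a batch of paired sample perceptual features follows; this is the standard SimCLR-style formulation, assumed here as one possible instantiation (the disclosure does not fix the exact form).

```python
# Sketch of the NT-Xent (normalized temperature-scaled cross-entropy) loss,
# assuming the standard SimCLR formulation; tau is an illustrative value.
import torch
import torch.nn.functional as F

def nt_xent_loss(feat_a: torch.Tensor, feat_b: torch.Tensor, tau: float = 0.1):
    """feat_a[i] and feat_b[i] are perceptual features of a positive pair."""
    n = feat_a.size(0)
    z = F.normalize(torch.cat([feat_a, feat_b], dim=0), dim=1)  # (2n, d)
    sim = z @ z.t() / tau              # pairwise cosine similarities / temperature
    sim.fill_diagonal_(float('-inf'))  # a sample is never its own positive
    # For row i the positive sits at index i + n (and vice versa); every other
    # entry in the row acts as a negative.
    targets = torch.cat([torch.arange(n) + n, torch.arange(n)]).to(sim.device)
    return F.cross_entropy(sim, targets)
```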
  • In some embodiments, training the style coding network to be trained and the perception network to be trained based on the contrast loss information may include: updating the network parameters of the style coding network to be trained and the perception network to be trained based on the contrast loss information; and then, based on the updated style coding network to be trained and perception network to be trained, repeating the training iteration from inputting the positive sample style image pair and the negative sample style image pair into the style coding network to be trained for style coding processing, through obtaining their corresponding sample style codes, to updating the network parameters of the style coding network to be trained and the perception network to be trained based on the contrast loss information, until a first preset convergence condition is reached.
  • The current style coding network to be trained (that is, the trained style coding network) is then used as the style coding network.
  • In some embodiments, reaching the first preset convergence condition may be that the number of training iterations reaches a first preset training number. In some embodiments, reaching the first preset convergence condition may also be that the contrast loss information is smaller than a first preset threshold. In the embodiments of the present disclosure, the first preset training number and the first preset threshold can be preset in combination with the training speed and accuracy of the network in practical applications.
  • In the embodiments of the present disclosure, contrastive training of the style coding network to be trained based on the positive sample style image pair and the negative sample style image pair of the target style realizes self-supervised training of the style coding network and effectively ensures the accuracy with which the trained style coding network represents style features.
  • Moreover, the target style code of the target style is extracted from a style image of the target style, which can effectively improve the accuracy with which the target style code represents the target style.
  • In some embodiments, the target style may also be an image style obtained by random sampling.
  • Correspondingly, obtaining the target style code of the target style includes: randomly generating an initial style code based on a first preset distribution; and
  • inputting the initial style code into the first multi-layer perceptron network for perceptual processing to obtain the target style code.
  • In some embodiments, the first preset distribution may be a preset coding distribution; in some embodiments, the first preset distribution may include but is not limited to a Gaussian distribution.
  • In some embodiments, the first multi-layer perceptron network can be used to reduce the dimensionality of the initial style code and transform the initial style code from the coding space (e.g., a Gaussian distribution space) to the distribution space where the image generation network is located.
  • In this way, the generation efficiency and diversity of style codes can be improved, greatly increasing the flexibility of generating object style images; moreover, after the initial style code is randomly generated based on the first preset distribution, perceptual processing by the first multi-layer perceptron network makes the image generation network easier to train.
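  • A brief sketch of this sampling path (assuming a Gaussian first preset distribution and an illustrative two-layer perceptron; the preset object code can be produced analogously with the second multi-layer perceptron network):

```python
# Sketch of randomly generating an initial style code from a Gaussian and
# mapping it into the generator's distribution space; dimensions and the
# two-layer structure are illustrative assumptions.
import torch
import torch.nn as nn

z_dim = w_dim = 512
first_mlp = nn.Sequential(                    # first multi-layer perceptron network
    nn.Linear(z_dim, w_dim), nn.LeakyReLU(0.2),
    nn.Linear(w_dim, w_dim), nn.LeakyReLU(0.2),
)

initial_style_code = torch.randn(1, z_dim)    # z ~ N(0, I): first preset distribution
target_style_code = first_mlp(initial_style_code)
```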
  • In some embodiments, the preset object code is obtained through the following steps: randomly generating an initial object code based on a second preset distribution; and
  • inputting the initial object code into the second multi-layer perceptron network for perceptual processing to obtain the preset object code.
  • In some embodiments, the second preset distribution may be a preset coding distribution; in some embodiments, the second preset distribution may include but is not limited to a Gaussian distribution.
  • In some embodiments, the second multi-layer perceptron network can be used to reduce the dimensionality of the initial object code and transform the initial object code from the coding space (e.g., a Gaussian distribution space) to the distribution space where the image generation network resides.
  • In this way, the generation efficiency of object codes can be improved, and a large number of object style images can be quickly generated for a certain style, effectively improving the image generation efficiency for that style; moreover, after the initial object code is randomly generated based on the second preset distribution, perceptual processing by the second multi-layer perceptron network makes the image generation network easier to train.
  • In step S203, style fusion processing is performed on the target style code and the preset object code based on the network fusion parameters corresponding to the preset number of network layers in the style fusion network, to obtain the target style fusion code.
  • In some embodiments, the style fusion network can be used to perform style fusion processing on the target style code and the preset object code.
  • In some embodiments, the style fusion network may include a preset number of network layers; in some embodiments, the number of network layers (the preset number) may be set in combination with actual applications.
  • In some embodiments, the style fusion network may be a style fusion network capable of regulating the degree of fusion and adaptively adjusting fusion weights.
  • In some embodiments, the degree of fusion can be controlled by regulating the fusion position of the target style code and the preset object code in the style fusion network (that is, from which network layer fusion starts).
  • In some embodiments, fusion weight learning can be performed on the target style code and the preset object code to achieve adaptive adjustment of the fusion weights of objects under different target styles.
  • In some embodiments, the network fusion parameters may be determined based on the fusion data corresponding to the preset number of network layers and the target fusion weight, and the target fusion weight is obtained by performing fusion weight learning based on the target style code and the preset object code.
  • In some embodiments, as shown in FIG. 5, performing style fusion processing on the target style code and the preset object code based on the network fusion parameters corresponding to the preset number of network layers in the style fusion network to obtain the target style fusion code may include the following steps:
  • In step S501, target fusion control data is acquired.
  • The target fusion control data is used to control the fusion position of the target style code and the preset object code in the style fusion network.
  • In step S503, the fusion data corresponding to the preset number of network layers is determined according to the target fusion control data.
  • In some embodiments, determining the fusion data corresponding to the preset number of network layers according to the target fusion control data may include: comparing the layer number of each of the preset number of network layers with the target fusion control data to obtain a comparison result, and
  • determining the fusion data corresponding to the preset number of network layers according to the comparison result.
  • In some embodiments, the fusion data corresponding to each network layer may represent whether the target style code participates in fusion at that network layer.
  • In some embodiments, the preset number of network layers included in the style fusion network may be a preset number of sequentially arranged network layers, such as the 0th layer to the n-th layer. In some embodiments, when the comparison result indicates that the layer number of a network layer is less than the target fusion control data, the fusion data corresponding to that network layer is 0, that is, the target style code does not participate in fusion at that network layer; conversely, when the comparison result indicates that the layer number of a network layer is greater than or equal to the target fusion control data, the fusion data corresponding to that network layer is 1, that is, the target style code participates in fusion at that network layer.
  • In this way, controllable style feature fusion can be realized, which in turn ensures that subsequently generated images have higher similarity to natural objects.
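  • A sketch of this comparison, assuming 0-indexed layers and a tensor representation of the fusion data:

```python
# Sketch of step S503: derive per-layer fusion data by comparing each layer
# number with the target fusion control data (layers assumed 0-indexed).
import torch

def make_fusion_data(num_layers: int, control: int) -> torch.Tensor:
    layers = torch.arange(num_layers)
    # 0 where the layer number is below the control value (the style code
    # does not participate in fusion there), 1 from the control layer onward.
    return (layers >= control).float()

print(make_fusion_data(18, 4))  # layers 0-3 -> 0., layers 4-17 -> 1.
```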
  • In step S505, the target style code and the preset object code are concatenated to obtain a target splicing code.
  • In step S507, fusion weight learning is performed based on the target splicing code to obtain the target fusion weight.
  • In some embodiments, the target splicing code can be input into a fully connected layer for fusion weight learning to obtain the target fusion weight.
  • In some embodiments, the target fusion weight can be used to control the fusion ratio of the target style code and the preset object code in each fusion layer.
  • In this way, fusion weight learning is performed on the target splicing code, which includes the target style code and the preset object code, so that the learned target fusion weight is adaptive to different types of objects and styles, and the target style is better fused.
  • In step S509, the fusion data and the target fusion weight are weighted to obtain the network fusion parameters.
  • In some embodiments, both the target fusion weight and the fusion data are data in matrix form.
  • In some embodiments, the target fusion weight and the corresponding elements of the fusion data can be multiplied to obtain the network fusion parameters.
  • In step S511, style fusion processing is performed on the target style code and the preset object code in the preset number of network layers based on the network fusion parameters, to obtain the target style fusion code.
  • In some embodiments, the network fusion parameters corresponding to the preset number of network layers can be used as the weight of the target style code, and the matrix obtained by subtracting the network fusion parameters from an all-ones matrix can be used as the weight of the preset object code; based on these two weights, the target style code and the preset object code are weighted and summed to obtain the target style fusion code.
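  • A sketch of steps S505 to S511 under assumed shapes (per-layer codes of dimension d, one fully connected layer producing per-layer weights, and a sigmoid keeping the learned weights in [0, 1]; none of these details are fixed by the disclosure):

```python
# Sketch of style fusion: splice codes, learn fusion weights, mask them with
# the fusion data, and blend the codes per layer; shapes are assumptions.
import torch
import torch.nn as nn

class StyleFusion(nn.Module):
    def __init__(self, d: int = 512, num_layers: int = 18):
        super().__init__()
        self.num_layers = num_layers
        # Fully connected layer for fusion weight learning (steps S505/S507).
        self.weight_fc = nn.Linear(2 * d, num_layers * d)

    def forward(self, style_code, object_code, fusion_data):
        # Step S505: splice the target style code and the preset object code.
        spliced = torch.cat([style_code, object_code], dim=-1)
        # Step S507: learn per-layer fusion weights from the spliced code.
        w = torch.sigmoid(self.weight_fc(spliced))
        w = w.view(-1, self.num_layers, style_code.size(-1))
        # Step S509: weight the fusion data with the fusion weights
        # (elementwise product) to obtain the network fusion parameters.
        params = w * fusion_data.view(1, -1, 1)
        # Step S511: the parameters weight the style code; the all-ones matrix
        # minus the parameters weights the object code.
        style = style_code.unsqueeze(1)   # broadcast over the layer axis
        obj = object_code.unsqueeze(1)
        return params * style + (1.0 - params) * obj  # (batch, layers, d)
```

  • For example, combined with the mask from the previous sketch, StyleFusion()(style_code, object_code, make_fusion_data(18, 4)) yields a per-layer fused code whose first four layers are unaffected by the style code.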
  • In an example, the style fusion network includes 18 network layers.
  • When the target fusion control data is 1, the target style code and the preset object code are fused starting from the first network layer, and the subsequent image generation network outputs a fully stylized object style image; when the target fusion control data is 18, the target style code does not participate in fusion, and correspondingly the subsequent image generation network outputs an unstylized natural object image; when the target fusion control data i is greater than 1 and less than 18, the target style code and the preset object code are fused from the i-th layer, and the low-resolution layers (that is, the 1st to (i-1)-th layers) are not affected by the target style code. Based on the target fusion control data, it can be ensured that the target style fusion code retains feature information shared with the natural object while having different degrees of style characteristics, which in turn ensures that subsequently generated images are more similar to natural objects.
  • In some embodiments, the target fusion control data being used to control the fusion position of the target style code and the preset object code in the style fusion network may be controlling the starting fusion position of the target style code and the preset object code in the style fusion network.
  • In the embodiments of the present disclosure, in the style fusion network, controlling the fusion positions of the target style code and the preset object code across multiple network layers by means of the target fusion control data realizes regulation of the degree of fusion; performing fusion weight learning on the target splicing code, which includes the target style code and the preset object code, makes the learned target fusion weight adaptive to different types of objects and styles, realizing adaptive adjustment of the fusion weights of objects under different target styles and better fusing the object style code under the target style. This ensures that the obtained target style fusion code retains feature information shared with the natural object while having adjustable style features and adaptive fusion weights.
  • In step S205, the target style fusion code is input into the target image generation network for image generation processing, and the preset object style image corresponding to the target style is obtained.
  • In some embodiments, the target image generation network can be used to generate the preset object style image corresponding to the target style.
  • In some embodiments, the above method further includes a step of pre-training the target image generation network and the style fusion network. In some embodiments, as shown in FIG. 6, this step may include:
  • In step S601, a first sample style code of the target style, a sample object code, a second sample style code of a non-target style, a preset style object image, and a preset object image are acquired.
  • In some embodiments, for the acquisition of the first sample style code and the second sample style code, refer to the above manner of obtaining the target style code, which will not be repeated here.
  • In some embodiments, for the acquisition of the sample object code, refer to the above manner of obtaining the preset object code, which will not be repeated here.
  • In some embodiments, the preset style object image can be obtained from a stylized object image training set corresponding to the target style; taking the object being a human face as an example, it can be obtained from a collected training set of face style images of the target style. In some embodiments, the preset object image may be an original object image; taking the object being a human face as an example, it may be an image of a real human face.
  • In step S603, style fusion processing is performed on the first sample style code and the sample object code based on sample network fusion parameters corresponding to the preset number of network layers to be trained in the style fusion network to be trained, to obtain a sample style fusion code.
  • In some embodiments, the style fusion network to be trained may include a preset number of network layers to be trained; the sample network fusion parameters may be determined based on the sample fusion data corresponding to the preset number of network layers to be trained and the sample fusion weight, and the sample fusion weight is obtained by performing fusion weight learning based on the first sample style code and the sample object code.
  • In some embodiments, the sample style fusion code includes a first style fusion code and a second style fusion code. Correspondingly, as shown in FIG. 7, performing style fusion processing on the first sample style code and the sample object code based on the sample network fusion parameters corresponding to the preset number of network layers to be trained in the style fusion network to be trained to obtain the sample style fusion code may include the following steps:
  • In step S701, first fusion control data and second fusion control data are obtained.
  • In some embodiments, the first fusion control data is used to control the first sample style code and the sample object code to be fused starting from the first network layer in the style fusion network to be trained; combined with the above embodiment of the target fusion control data, the first fusion control data is 1. In some embodiments, the second fusion control data is used to control the first sample style code not to participate in fusion in the style fusion network to be trained; taking the above embodiment in which the preset number is 18 as an example, the second fusion control data is 18.
  • In step S703, first sample fusion data and second sample fusion data corresponding to the preset number of network layers to be trained are respectively determined according to the first fusion control data and the second fusion control data.
  • In some embodiments, the first sample fusion data corresponding to the preset number of network layers to be trained is determined according to the first fusion control data, and the second sample fusion data corresponding to the preset number of network layers to be trained is determined according to the second fusion control data.
  • In step S705, the first sample style code and the sample object code are concatenated to obtain a sample splicing code.
  • In step S707, fusion weight learning is performed based on the sample splicing code to obtain a sample fusion weight.
  • In step S709, the first sample fusion data and the second sample fusion data are each weighted with the sample fusion weight, to obtain first sample network fusion parameters and second sample network fusion parameters.
  • In step S711, style fusion processing is performed on the first sample style code and the sample object code in the preset number of network layers to be trained based on the first sample network fusion parameters and the second sample network fusion parameters respectively, to obtain the first style fusion code and the second style fusion code.
  • For specific details of the above steps S705 to S711, refer to the above steps S505 to S511, which will not be repeated here.
  • In this way, during network training, the sample object code can be fused with the first sample style code to varying degrees, and the strength of the object style can be flexibly controlled, so as to improve the image quality and generation efficiency of subsequently generated object style images.
  • In step S605, the sample style fusion code is input into the image generation network to be trained for image generation processing, and a sample object style image corresponding to the target style is obtained.
  • In some embodiments, the sample object style image may include a first sample object style image corresponding to the first style fusion code and a second sample object style image corresponding to the second style fusion code. Correspondingly, inputting the sample style fusion code into the image generation network to be trained for image generation processing to obtain the sample object style image corresponding to the target style includes: inputting the first style fusion code and the second style fusion code into the image generation network to be trained for image generation processing, to obtain the first sample object style image and the second sample object style image.
  • In step S607, the sample object style image, the preset style object image, the preset object image, the first sample style code, and the second sample style code are input into the discriminant network to be trained for style discrimination processing, to obtain target discrimination information.
  • In some embodiments, the discriminant network to be trained includes an object discriminant network, a style object discriminant network, and a style code discriminant network; correspondingly, the target discrimination information may include object discrimination information, style object discrimination information, and style code discrimination information.
  • In the case where the style fusion network to be trained is a network with a fixed fusion structure, inputting the sample object style image, the preset style object image, the preset object image, the first sample style code, and the second sample style code into the discriminant network to be trained for style discrimination processing to obtain the target discrimination information includes: inputting the sample object style image and the preset object image into the object discriminant network for object discrimination processing to obtain object discrimination information (the object discrimination information may include the feature information output by the object discriminant network for the sample object style image and for the preset object image); inputting the sample object style image and the preset style object image into the style object discriminant network for style object discrimination processing to obtain style object discrimination information (the style object discrimination information may include the feature information output by the style object discriminant network for the sample object style image and for the preset style object image); and inputting the sample object style image, the first sample style code, and the second sample style code into the style code discriminant network for style code discrimination processing to obtain style code discrimination information.
  • In the case where the style fusion network to be trained is a network capable of regulating the degree of fusion, inputting the sample object style image, the preset style object image, the preset object image, the first sample style code, and the second sample style code into the discriminant network to be trained for style discrimination processing to obtain the target discrimination information may include: inputting the second sample object style image and the preset object image into the object discriminant network for object discrimination processing to obtain object discrimination information (the object discrimination information may include the feature information output by the object discriminant network for the second sample object style image and for the preset object image); inputting the first sample object style image and the preset style object image into the style object discriminant network for style object discrimination processing to obtain style object discrimination information (the style object discrimination information may include the feature information output by the style object discriminant network for the first sample object style image and for the preset style object image); and inputting the second sample object style image, the first sample style code, and the second sample style code into the style code discriminant network for style code discrimination processing to obtain style code discrimination information.
  • In the embodiments of the present disclosure, adversarial training of the image generation network to be trained for generating object style images is carried out from the three dimensions of object discrimination, style object discrimination, and style code discrimination, which can greatly improve the trained image generation network's ability to represent object style images and thus the quality of the generated object style images.
  • In step S609, target loss information is determined according to the target discrimination information.
  • In some embodiments, the target loss information may include generation loss information corresponding to the image generation network to be trained and discrimination loss information corresponding to the discriminant network to be trained.
  • In some embodiments, an adversarial loss function may be used in the process of determining the target loss information according to the target discrimination information.
  • In some embodiments, the object discrimination loss between the feature information output by the object discriminant network for the second sample object style image and for the preset object image can be determined in combination with the adversarial loss function; the style object discrimination loss between the feature information output by the style object discriminant network for the first sample object style image and for the preset style object image can be determined in combination with the adversarial loss function; and the style code discrimination loss between the feature information output by the style code discriminant network for the first sample style code and for the second sample style code can be determined in combination with the adversarial loss function.
  • In some embodiments, the object discrimination loss, the style object discrimination loss, and the style code discrimination loss can be added to obtain the generation loss information, and the negative of the generation loss information can be used as the discrimination loss information.
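  • A sketch of this loss combination, assuming discriminator logits as inputs and a non-saturating binary cross-entropy form for the adversarial loss function (the disclosure leaves the exact adversarial loss unspecified):

```python
# Sketch of step S609: sum the three discrimination losses into the
# generation loss and negate it for the discrimination loss; the BCE form of
# the adversarial loss is an assumption.
import torch
import torch.nn.functional as F

def adv_g_loss(fake_logits: torch.Tensor) -> torch.Tensor:
    # The generator is rewarded when fake samples are judged real.
    return F.binary_cross_entropy_with_logits(
        fake_logits, torch.ones_like(fake_logits))

def target_loss(object_logits, style_object_logits, style_code_logits):
    object_loss = adv_g_loss(object_logits)               # object discrimination
    style_object_loss = adv_g_loss(style_object_logits)   # style object discrimination
    style_code_loss = adv_g_loss(style_code_logits)       # style code discrimination
    generation_loss = object_loss + style_object_loss + style_code_loss
    discrimination_loss = -generation_loss                # negative of the generation loss
    return generation_loss, discrimination_loss
```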
  • In step S611, the style fusion network to be trained, the image generation network to be trained, and the discriminant network to be trained are trained based on the target loss information.
  • In some embodiments, training the style fusion network to be trained, the image generation network to be trained, and the discriminant network to be trained based on the target loss information may include: updating the network parameters of the image generation network to be trained and the style fusion network to be trained based on the generation loss information, and updating the network parameters of the discriminant network to be trained (the object discriminant network, the style object discriminant network, and the style code discriminant network) based on the discrimination loss information; and then, based on the updated style fusion network to be trained, image generation network to be trained, and discriminant network to be trained, repeating the training iteration from performing style fusion processing on the first sample style code and the sample object code based on the sample network fusion parameters corresponding to the preset number of network layers to be trained to obtain the sample style fusion code, to updating the network parameters based on the generation loss information and the discrimination loss information, until a second preset convergence condition is reached.
  • In some embodiments, reaching the second preset convergence condition may be that the number of training iterations reaches a second preset training number. In some embodiments, reaching the second preset convergence condition may also be that the generation loss information is less than a second preset threshold. In the embodiments of the present disclosure, the second preset training number and the second preset threshold can be preset in combination with the training speed and accuracy of the network in practical applications.
  • in step S613, the trained style fusion network to be trained is used as the style fusion network, and the trained image generation network to be trained is used as the target image generation network.
  • when the second preset convergence condition is reached, the current style fusion network to be trained (i.e., the trained style fusion network to be trained) is used as the above-mentioned style fusion network, and the current image generation network to be trained (i.e., the trained image generation network to be trained) is used as the above-mentioned target image generation network.
  • FIG. 8 is a schematic diagram of a training style fusion network and a target image generation network provided according to an exemplary embodiment.
  • the joint training of the target image generation network and the style fusion network can realize the fusion of style features and object features, and training the image generation network on the fused sample object style images can greatly improve its ability to represent object styles and object features, which effectively improves the quality of the subsequently generated object style images.
  • a large number of preset object style images of the target style can be generated based on the above-mentioned image generation method provided by the embodiments of the present disclosure.
  • the large number of preset object style images of the target style generated by the image generation method can then be used to conduct adversarial training on the first preset image generation network to obtain the first style transfer network.
  • the first style transfer network may be used to generate an object image of a target style for a certain target object.
  • during this training, the first sample object image can be input into the first preset image generation network for style conversion processing to obtain the object style image corresponding to the first sample object image; the object style image and the corresponding preset object style image can be input into the corresponding discrimination network for style discrimination processing to obtain the first style discrimination information; the corresponding discrimination loss information can be determined based on the first style discrimination information; and the first preset image generation network and the corresponding discrimination network can then be trained based on the discrimination loss information, with the trained first preset image generation network used as the first style transfer network.
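  • the following sketch illustrates this adversarial fine-tuning loop, assuming a PyTorch setup with a paired data loader; the network architectures, loss form, and optimizer settings are placeholders rather than the patent's concrete choices:

```python
import torch

def train_first_style_transfer(generator, discriminator, loader, g_opt, d_opt):
    bce = torch.nn.BCEWithLogitsLoss()
    for sample_object_image, preset_object_style_image in loader:
        # Style conversion processing on the first sample object image.
        object_style_image = generator(sample_object_image)

        # Style discrimination processing -> first style discrimination info.
        d_real = discriminator(preset_object_style_image)
        d_fake = discriminator(object_style_image.detach())
        d_loss = (bce(d_real, torch.ones_like(d_real))
                  + bce(d_fake, torch.zeros_like(d_fake)))
        d_opt.zero_grad(); d_loss.backward(); d_opt.step()

        # Generator update: fool the discrimination network.
        g_logits = discriminator(object_style_image)
        g_loss = bce(g_logits, torch.ones_like(g_logits))
        g_opt.zero_grad(); g_loss.backward(); g_opt.step()
    return generator  # used afterwards as the first style transfer network
```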
  • preset object style images of various target styles can be generated.
  • the preset object style images of multiple target styles generated by the above image generation method, together with the corresponding target style labels, can be used to conduct adversarial training on the second preset image generation network to obtain a second style transfer network.
  • the second style transfer network can be used to generate object style images of various target styles.
  • when training the second preset image generation network, the second sample object image and the corresponding target style label can be input into the second preset image generation network for style conversion processing to obtain the object style image corresponding to the second sample object image that matches the target style label; the object style image and the corresponding preset object style image can be input into the corresponding discrimination network for style discrimination processing to obtain the second style discrimination information; in addition, a discrimination network can be added to judge whether the object style image output by the second preset image generation network has the style corresponding to the target style label, obtaining the third style discrimination information; the corresponding loss information is then determined based on the second style discrimination information and the third style discrimination information, and this loss information is used to train the second preset image generation network and the corresponding discrimination networks.
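  • for this label-conditioned variant, a sketch of the generator-side losses is given below, assuming integer style labels, a label-conditioned generator and discriminator, and an added classifier-style discrimination network; all of these are illustrative assumptions:

```python
import torch.nn.functional as F

def second_generator_loss(generator, cond_discriminator, label_discriminator,
                          second_sample_object_image, target_style_label):
    # Style conversion processing conditioned on the target style label.
    object_style_image = generator(second_sample_object_image, target_style_label)

    # Second style discrimination information: adversarial, label-conditioned.
    adv_logits = cond_discriminator(object_style_image, target_style_label)
    adv_loss = F.softplus(-adv_logits).mean()

    # Third style discrimination information: does the output match the label?
    label_logits = label_discriminator(object_style_image)
    label_loss = F.cross_entropy(label_logits, target_style_label)
    return adv_loss + label_loss
```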
  • the present disclosure decouples the stylized image into two parts: object coding and style coding.
  • performing style fusion processing on the target style code and the preset object code fuses the two to obtain a target style fusion code that can both represent the object characteristics of a certain type of object and effectively incorporate the target style; and since the target fusion weight is obtained by fusion weight learning based on the target style code and the preset object code, the fusion weight of an object under different target styles can be adjusted adaptively.
  • this can greatly improve the adaptive generation efficiency of multi-style object style images.
  • Fig. 9 is a flowchart of another image generation method according to an exemplary embodiment. As shown in Fig. 9, the image generation method is used in electronic devices such as terminals and servers, and includes the following steps.
  • in step S901, a first original object image of a first target object is acquired;
  • in step S903, the first original object image is input into the first style conversion network for style conversion processing, to obtain the first target object style image corresponding to the first target object;
  • the first original object image may be an object image of the first target object uploaded by the user through the terminal. Taking the first target object as an example of a user's face, the first original object image may be a real face image of the user.
  • the first target object style image may be an object image of the target style of the first target object.
  • the terminal may input the first original object image into the first style conversion network for style conversion processing to obtain the first target object style image corresponding to the first target object.
  • the terminal may also send the first original object image to the server, and the server generates the first target object style image based on the first style conversion network and transmits it to the terminal.
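  • a minimal on-device inference sketch for steps S901 and S903 follows, assuming the trained network is already loaded as a PyTorch module; the function name and tensor layout are illustrative:

```python
import torch

@torch.no_grad()
def stylize_first_target_object(first_style_conversion_network,
                                first_original_object_image):
    first_style_conversion_network.eval()
    # Single forward pass: style conversion processing yields the
    # first target object style image.
    batch = first_original_object_image.unsqueeze(0)  # add batch dimension
    return first_style_conversion_network(batch).squeeze(0)
```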
  • performing style conversion on the original object image of the first target object with a first style conversion network trained on preset object style images, which both retain the object characteristics of natural objects and effectively incorporate the style characteristics of the target style, not only effectively improves the stylization effect but also ensures consistency between the object features in the converted first target object style image and the object features of the first target object, thereby greatly improving the quality of the stylized image.
  • Fig. 10 is a flowchart of another image generation method according to an exemplary embodiment. As shown in Fig. 10, the image generation method is used in electronic devices such as terminals and servers, and includes the following steps.
  • in step S1001, a second original object image and a target style label of a second target object are acquired;
  • in step S1003, the second original object image and the target style label are input into the second style conversion network for style conversion processing, to obtain the second target object style image corresponding to the second target object;
  • the second original object image may be an object image of the second target object uploaded by the user through the terminal. Taking the second target object as an example of a user's face, the second original object image may be a real face image of the user.
  • the target style label may be identification information of a certain style selected by the user.
  • the style image of the second target object may be an object image of a style corresponding to the target style label of the second target object.
  • the terminal may input the second original object image and the target style label into the second style conversion network for style conversion processing to obtain the second target object style image corresponding to the second target object.
  • the terminal may also send the second original object image and the target style label to the server, and the server generates the second target object style image based on the second style conversion network and transmits it to the terminal.
  • using the second style conversion network, trained on preset object style images that both retain the object characteristics of natural objects and effectively incorporate the style characteristics of multiple target styles, to convert the original object image of the second target object to the style corresponding to the target style label can effectively improve the stylization effect and ensure consistency between the object features in the converted second target object style image and the object features of the second target object, thereby greatly improving the quality of the stylized images.
  • Fig. 11 is a block diagram of an image generating device according to an exemplary embodiment. Referring to Figure 11, the device includes:
  • the code acquisition module 1110 is configured to acquire the preset object code and the target style code of the target style;
  • the first style fusion processing module 1120 is configured to perform style fusion processing on the target style code and the preset object code based on network fusion parameters corresponding to a preset number of network layers in the style fusion network, to obtain the target style fusion code; the network fusion parameters are determined based on fusion data corresponding to the preset number of network layers and a target fusion weight, and the target fusion weight is obtained by performing fusion weight learning based on the target style code and the preset object code;
  • the first image generation processing module 1130 is configured to input the target style fusion code into the target image generation network for image generation processing, and obtain a preset object style image corresponding to the target style.
  • the first style fusion processing module 1120 includes:
  • the target fusion control data acquisition unit is configured to execute the acquisition of target fusion control data, and the target fusion control data is used to control the fusion position of the target style code and the preset object code in the style fusion network;
  • the fusion data determination unit is configured to determine, according to the target fusion control data, the fusion data corresponding to the preset number of network layers;
  • the first splicing processing unit is configured to perform splicing processing on the target style code and the preset object code to obtain the target spliced code;
  • the first fusion weight learning unit is configured to perform fusion weight learning based on target splicing coding to obtain target fusion weights
  • the first weighting processing unit is configured to perform weighting processing on the fusion data and the target fusion weight to obtain network fusion parameters
  • the first style fusion processing unit is configured to perform style fusion processing on the target style code and the preset object code in a preset number of network layers based on the network fusion parameters to obtain the target style fusion code.
  • the fusion data determination unit includes:
  • the comparison unit is configured to compare the number of layers corresponding to the preset number of network layers with the target fusion control data to obtain a comparison result
  • the fused data determination subunit is configured to determine fused data corresponding to a preset number of network layers according to the comparison result.
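  • pulling the units above together, the following is a minimal sketch of the fusion-parameter computation: splice the two codes, learn a per-layer fusion weight, derive per-layer fusion data by comparing each layer index against the control data, and blend the codes layer by layer. The dimensions, the MLP structure, and the "layers below the control index fuse" convention are assumptions, not the patent's concrete design:

```python
import torch
import torch.nn as nn

class StyleFusionSketch(nn.Module):
    def __init__(self, code_dim=512, num_layers=14):
        super().__init__()
        self.num_layers = num_layers
        # Fusion weight learning from the target spliced code.
        self.weight_mlp = nn.Sequential(
            nn.Linear(2 * code_dim, code_dim), nn.ReLU(),
            nn.Linear(code_dim, num_layers), nn.Sigmoid())

    def forward(self, style_code, object_code, target_fusion_control):
        # Comparison result: which layers participate in fusion (assumed
        # convention: layers below the control index fuse the style code).
        layer_index = torch.arange(self.num_layers)
        fusion_data = (layer_index < target_fusion_control).float()

        # Target spliced code -> target fusion weight.
        spliced = torch.cat([style_code, object_code], dim=-1)
        fusion_weight = self.weight_mlp(spliced)

        # Network fusion parameters = fusion data weighted by fusion weight.
        net_fusion = (fusion_data * fusion_weight).unsqueeze(-1)  # (B, L, 1)

        # Per-layer style fusion: interpolate object and style codes.
        obj = object_code.unsqueeze(1)   # (B, 1, D)
        sty = style_code.unsqueeze(1)    # (B, 1, D)
        return (1 - net_fusion) * obj + net_fusion * sty  # (B, L, D)
```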
  • the code acquisition module 1110 includes:
  • a reference style image acquiring unit configured to acquire a reference style image of a target style
  • the style encoding processing unit is configured to input the reference style image into the style encoding network for style encoding processing to obtain the target style encoding.
  • the above-mentioned device also includes:
  • the sample image acquisition module is configured to perform acquisition of a positive sample style image pair and a negative sample style image pair of the target style
  • the style coding processing module is configured to input the positive sample style image pair and the negative sample style image pair into the style coding network to be trained for style coding processing, to obtain the sample style codes corresponding to the positive sample style image pair and the negative sample style image pair respectively;
  • the perceptual processing module is configured to input the sample style code into the perceptual network to be trained for perceptual processing, and obtain the respective sample perceptual feature information corresponding to the positive sample style image pair and the negative sample style image pair;
  • the comparison loss information determination module is configured to determine the comparison loss information according to the sample perception feature information
  • the first network training module is configured to perform the training of the style encoding network to be trained and the perception network to be trained based on the comparison loss information;
  • the style coding network determination module is configured to execute the trained style coding network to be trained as the style coding network.
  • the code acquisition module 1110 includes:
  • an initial style code generating unit configured to randomly generate an initial style code based on a first preset distribution
  • the first perceptual processing unit is configured to input the initial style code into the first multi-layer perceptual network for perceptual processing to obtain the target style code.
  • the code acquisition module 1110 includes:
  • the initial object code generating unit is configured to randomly generate the initial object code based on the second preset distribution
  • the second perceptual processing unit is configured to input the initial object code into the second multi-layer perceptual network for perceptual processing to obtain the preset object code.
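  • the two random-code branches above can be sketched as follows, assuming standard normal distributions for both preset distributions and simple MLPs for the multi-layer perceptual networks, in the spirit of a StyleGAN-style mapping network; the dimensions and depths are illustrative:

```python
import torch
import torch.nn as nn

def make_perceptual_mlp(dim=512, depth=4):
    """A small multi-layer perceptual network (illustrative sizes)."""
    layers = []
    for _ in range(depth):
        layers += [nn.Linear(dim, dim), nn.LeakyReLU(0.2)]
    return nn.Sequential(*layers)

first_mlp = make_perceptual_mlp()   # first multi-layer perceptual network (style)
second_mlp = make_perceptual_mlp()  # second multi-layer perceptual network (object)

# First / second preset distributions (standard normal assumed here).
initial_style_code = torch.randn(1, 512)
initial_object_code = torch.randn(1, 512)

target_style_code = first_mlp(initial_style_code)
preset_object_code = second_mlp(initial_object_code)
```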
  • the above-mentioned device also includes:
  • a sample data acquisition module configured to perform acquisition of a first sample style code of a target style, a sample object code, a second sample style code of a non-target style, a preset style object image, and a preset object image;
  • the second style fusion processing module is configured to perform style fusion processing on the first sample style code and the sample object code based on the sample network fusion parameters corresponding to a preset number of network layers to be trained in the style fusion network to be trained, and obtain Sample style fusion encoding;
  • the second image generation processing module is configured to input the sample style fusion code into the image generation network to be trained for image generation processing, and obtain the sample object style image corresponding to the target style;
  • the style discrimination processing module is configured to perform style discrimination processing by inputting the sample object style image, the preset style object image, the preset object image, the first sample style code and the second sample style code into the discriminant network to be trained to obtain the target discriminant information;
  • the target loss information determination module is configured to determine the target loss information according to the target discrimination information
  • the second network training module is configured to perform training based on the target loss information to train the style fusion network to be trained, the image generation network to be trained and the discrimination network to be trained;
  • the network determination module is configured to use the trained style fusion network to be trained as the style fusion network, and use the trained image generation network to be trained as the target image generation network.
  • the sample style fusion encoding includes a first style fusion encoding and a second style fusion encoding;
  • the second style fusion processing module includes:
  • the sample fusion control data acquisition unit is configured to acquire the first fusion control data and the second fusion control data; the first fusion control data is used to control the first sample style code and the sample object code to start fusing from the first network layer of the style fusion network to be trained;
  • the second fusion control data is used to control the first sample style code so that it does not participate in fusion in the style fusion network to be trained;
  • the sample fusion data determination unit is configured to determine the first sample fusion data and the second sample fusion data corresponding to the preset number of network layers to be trained according to the first fusion control data and the second fusion control data;
  • the second splicing processing unit is configured to perform splicing processing on the first sample style code and the sample object code to obtain the sample spliced code;
  • the second fused weight learning unit is configured to perform fused weight learning based on sample splicing coding to obtain sample fused weights
  • the second weighting processing unit is configured to perform weighting processing on the first sample fusion data, the second sample fusion data and the sample fusion weight respectively, to obtain the first sample network fusion parameters and the second sample network fusion parameters;
  • the second style fusion processing unit is configured to perform style fusion processing on the first sample style code and the sample object code in the preset number of network layers to be trained, based on the first sample network fusion parameters and the second sample network fusion parameters respectively, to obtain the first style fusion code and the second style fusion code.
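  • a small sketch of how the two fusion control settings yield the two sample network fusion parameters; the all-ones / all-zeros encoding of the control data is an assumption consistent with the description above:

```python
import torch

num_layers = 14
# First fusion control data: fuse from the first network layer onwards.
first_fusion_control_data = torch.ones(num_layers)
# Second fusion control data: the style code does not participate in fusion.
second_fusion_control_data = torch.zeros(num_layers)

# Stand-in for the sample fusion weight learned from the sample spliced code.
sample_fusion_weight = torch.rand(num_layers)

first_sample_network_fusion = first_fusion_control_data * sample_fusion_weight
second_sample_network_fusion = second_fusion_control_data * sample_fusion_weight
# With all-zero parameters the second branch reproduces the object code only,
# giving the second style fusion code its "object features only" role.
```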
  • the sample object style image includes a first sample object style image corresponding to the first style fusion encoding and a second sample object style image corresponding to the second style fusion encoding;
  • the discriminant network to be trained includes an object discrimination network, a style object discrimination network, and a style encoding discrimination network;
  • target discrimination information includes object discrimination information, style object discrimination information and style coding discrimination information;
  • the style discrimination processing module includes:
  • the object discrimination processing unit is configured to perform object discrimination processing by inputting the second sample object style image and the preset object image into the object discrimination network to obtain object discrimination information;
  • a style object discrimination processing unit configured to input the first sample object style image and the preset style object image into the style object discrimination network for style object discrimination processing, and obtain style object discrimination information
  • the style code discrimination processing unit is configured to input the second sample object style image, the first sample style code and the second sample style code into the style code discrimination network for style code discrimination processing to obtain style code discrimination information.
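  • the routing of the three discrimination branches just listed can be summarised as below; the discriminant networks are assumed callables, and the pairing of inputs follows the three units above:

```python
def style_discrimination_processing(object_net, style_object_net, style_code_net,
                                    first_sample_object_style_image,
                                    second_sample_object_style_image,
                                    preset_object_image, preset_style_object_image,
                                    first_sample_style_code, second_sample_style_code):
    # Object branch: object-only generation vs. real preset object images.
    object_info = (object_net(second_sample_object_style_image),
                   object_net(preset_object_image))
    # Style object branch: fused stylized output vs. real preset style objects.
    style_object_info = (style_object_net(first_sample_object_style_image),
                         style_object_net(preset_style_object_image))
    # Style encoding branch: stylized image with target / non-target codes.
    style_code_info = (style_code_net(second_sample_object_style_image,
                                      first_sample_style_code),
                       style_code_net(second_sample_object_style_image,
                                      second_sample_style_code))
    return object_info, style_object_info, style_code_info
```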
  • Fig. 12 is a block diagram of another image generating device according to an exemplary embodiment. Referring to Figure 12, the device includes:
  • an original object image acquisition module 1210 configured to perform acquisition of a first original object image of the first target object
  • the first style conversion processing module 1220 is configured to input the first original object image into the first style conversion network for style conversion processing, and obtain the first target object style image corresponding to the first target object;
  • the first style conversion network is obtained by performing adversarial training on the first preset image generation network based on the first sample object image and the preset object style image of the target style generated by the above-mentioned image generation method.
  • Fig. 13 is a block diagram of another image generating device according to an exemplary embodiment. Referring to Figure 13, the device includes:
  • a data acquisition module 1310 configured to acquire a second original object image and a target style label of a second target object
  • the second style conversion processing module 1320 is configured to perform style conversion processing by inputting the second original object image and the target style label into the second style conversion network to obtain a second target object style image corresponding to the second target object;
  • the second style conversion network is obtained by performing adversarial training on the second preset image generation network based on the second sample object image, multiple target style labels, and preset object style images of multiple target styles generated by the above image generation method.
  • Fig. 14 is a block diagram of an electronic device for image generation according to an exemplary embodiment.
  • the electronic device may be a terminal, and its internal structure may be as shown in Fig. 14 .
  • the electronic device includes a processor, a memory, a network interface, a display screen and an input device connected through a system bus. Wherein, the processor of the electronic device is used to provide calculation and control capabilities.
  • the memory of the electronic device includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system and computer programs.
  • the internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage medium.
  • the network interface of the electronic device is used to communicate with an external terminal through a network connection.
  • the display screen of the electronic device may be a liquid crystal display screen or an electronic ink display screen;
  • the input device of the electronic device may be a touch layer covering the display screen, a button, trackball, or touchpad provided on the housing of the electronic device, or an external keyboard, touchpad, or mouse.
  • Fig. 15 is a block diagram of another electronic device for image generation according to an exemplary embodiment.
  • the electronic device may be a server, and its internal structure may be as shown in Fig. 15 .
  • the electronic device includes a processor, memory and network interface connected by a system bus. Wherein, the processor of the electronic device is used to provide calculation and control capabilities.
  • the memory of the electronic device includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system and computer programs.
  • the internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage medium.
  • the network interface of the electronic device is used to communicate with an external terminal through a network connection. When the computer program is executed by a processor, an image generation method is realized.
  • FIG. 14 or FIG. 15 is only a block diagram of a part of the structure related to the disclosed solution, and does not constitute a limitation on the electronic equipment to which the disclosed solution is applied.
  • the electronic device may include more or fewer components than shown in the figures, or combine certain components, or have a different arrangement of components.
  • an electronic device is also provided, including: a processor; and a memory for storing instructions executable by the processor; wherein the processor is configured to execute the instructions so as to implement the image generation method in the embodiments of the present disclosure.
  • a computer-readable storage medium is also provided, and when instructions in the storage medium are executed by a processor of the electronic device, the electronic device can execute the image generation method in the embodiments of the present disclosure.
  • a computer program product including instructions, which, when run on a computer, cause the computer to execute the image generation method in the embodiments of the present disclosure.
  • Nonvolatile memory can include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory can include random access memory (RAM) or external cache memory.
  • RAM is available in many forms, such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus Direct RAM (RDRAM), Direct Rambus Dynamic RAM (DRDRAM), and Rambus Dynamic RAM (RDRAM).

Abstract

The present invention relates to an image generation method and apparatus, an electronic device, and a storage medium. The method comprises: obtaining a preset object code and a target style code of a target style; performing style fusion processing on the target style code and the preset object code on the basis of network fusion parameters corresponding to a preset number of network layers in a style fusion network to obtain a target style fusion code, wherein the network fusion parameters are determined on the basis of fusion data corresponding to the preset number of network layers and a target fusion weight, and the target fusion weight is obtained by performing fusion weight learning on the basis of the target style code and the preset object code; and inputting the target style fusion code into a target image generation network for image generation processing to obtain a preset object style image corresponding to the target style.

Description

Image generation method and device
Cross References to Related Applications
This application claims the priority of Chinese Patent Application No. 202111371705.0, filed on November 18, 2021, the disclosure of which is incorporated herein in its entirety as a part of this application.
Technical Field
The present disclosure relates to the technical field of image processing, and in particular to an image generation method, device, electronic equipment, and storage medium.
Background
With the continuous development of image processing technology, the image style conversion function has become a new and entertaining feature in the field of image applications. Image style conversion technology can generate an image with a target style based on a given object image, such as a human face, so that the image has an artistic effect similar to the target style, thereby converting the object image into an object style image of a different style such as animation, oil painting, or pencil drawing.
In related technologies, generating a stylized image of an object requires pre-training a style conversion network with a style image generation function, but training a style conversion network often requires a large number of object style images as training images; moreover, to ensure that the object style images generated by the style conversion network remain consistent with the features of the original object and free of image distortion such as warping and deformation, the quality of the training images must be guaranteed.
Summary
The present disclosure provides an image generation method, device, electronic equipment, and storage medium, which can quickly generate high-quality object style images and improve the efficiency of adaptive generation of multi-style object style images. The technical scheme of the present disclosure is as follows:
According to a first aspect of the embodiments of the present disclosure, an image generation method is provided, including:
acquiring a preset object code and a target style code of a target style;
performing style fusion processing on the target style code and the preset object code based on network fusion parameters corresponding to a preset number of network layers in a style fusion network, to obtain a target style fusion code, where the network fusion parameters are determined based on fusion data corresponding to the preset number of network layers and a target fusion weight, and the target fusion weight is obtained by performing fusion weight learning based on the target style code and the preset object code; and
inputting the target style fusion code into a target image generation network for image generation processing, to obtain a preset object style image corresponding to the target style.
In some embodiments, performing style fusion processing on the target style code and the preset object code based on the network fusion parameters corresponding to the preset number of network layers in the style fusion network to obtain the target style fusion code includes:
acquiring target fusion control data, where the target fusion control data is used to control the fusion position of the target style code and the preset object code in the style fusion network;
determining the fusion data corresponding to the preset number of network layers according to the target fusion control data;
performing splicing processing on the target style code and the preset object code to obtain a target spliced code;
performing fusion weight learning based on the target spliced code to obtain the target fusion weight;
performing weighting processing on the fusion data and the target fusion weight to obtain the network fusion parameters; and
performing style fusion processing on the target style code and the preset object code in the preset number of network layers based on the network fusion parameters, to obtain the target style fusion code.
In some embodiments, determining the fusion data corresponding to the preset number of network layers according to the target fusion control data includes:
comparing the layer numbers corresponding to the preset number of network layers with the target fusion control data to obtain a comparison result; and
determining the fusion data corresponding to the preset number of network layers according to the comparison result.
In some embodiments, acquiring the target style code of the target style includes:
acquiring a reference style image of the target style; and
inputting the reference style image into a style encoding network for style encoding processing to obtain the target style code.
In some embodiments, the method further includes:
acquiring a positive sample style image pair and a negative sample style image pair of the target style;
inputting the positive sample style image pair and the negative sample style image pair into a style encoding network to be trained for style encoding processing, to obtain sample style codes corresponding to the positive sample style image pair and the negative sample style image pair respectively;
inputting the sample style codes into a perceptual network to be trained for perceptual processing, to obtain sample perceptual feature information corresponding to the positive sample style image pair and the negative sample style image pair respectively;
determining comparison loss information according to the sample perceptual feature information;
training the style encoding network to be trained and the perceptual network to be trained based on the comparison loss information; and
using the trained style encoding network to be trained as the style encoding network.
In some embodiments, acquiring the target style code of the target style includes:
randomly generating an initial style code based on a first preset distribution; and
inputting the initial style code into a first multi-layer perceptual network for perceptual processing to obtain the target style code.
In some embodiments, the preset object code is acquired through the following steps:
randomly generating an initial object code based on a second preset distribution; and
inputting the initial object code into a second multi-layer perceptual network for perceptual processing to obtain the preset object code.
In some embodiments, the method further includes:
acquiring a first sample style code of the target style, a sample object code, a second sample style code of a non-target style, a preset style object image, and a preset object image;
performing style fusion processing on the first sample style code and the sample object code based on sample network fusion parameters corresponding to a preset number of network layers to be trained in a style fusion network to be trained, to obtain a sample style fusion code;
inputting the sample style fusion code into an image generation network to be trained for image generation processing, to obtain a sample object style image corresponding to the target style;
inputting the sample object style image, the preset style object image, the preset object image, the first sample style code, and the second sample style code into a discrimination network to be trained for style discrimination processing, to obtain target discrimination information;
determining target loss information according to the target discrimination information;
training the style fusion network to be trained, the image generation network to be trained, and the discrimination network to be trained based on the target loss information; and
using the trained style fusion network to be trained as the style fusion network, and using the trained image generation network to be trained as the target image generation network.
In some embodiments, the sample style fusion code includes a first style fusion code and a second style fusion code, and performing style fusion processing on the first sample style code and the sample object code based on the sample network fusion parameters corresponding to the preset number of network layers to be trained in the style fusion network to be trained to obtain the sample style fusion code includes:
acquiring first fusion control data and second fusion control data, where the first fusion control data is used to control the first sample style code and the sample object code to start fusing from the first network layer of the style fusion network to be trained, and the second fusion control data is used to control the first sample style code so that it does not participate in fusion in the style fusion network to be trained;
determining first sample fusion data and second sample fusion data corresponding to the preset number of network layers to be trained according to the first fusion control data and the second fusion control data respectively;
performing splicing processing on the first sample style code and the sample object code to obtain a sample spliced code;
performing fusion weight learning based on the sample spliced code to obtain a sample fusion weight;
performing weighting processing on the first sample fusion data and the second sample fusion data with the sample fusion weight respectively, to obtain first sample network fusion parameters and second sample network fusion parameters; and
performing style fusion processing on the first sample style code and the sample object code in the preset number of network layers to be trained based on the first sample network fusion parameters and the second sample network fusion parameters respectively, to obtain the first style fusion code and the second style fusion code.
In some embodiments, the sample object style image includes a first sample object style image corresponding to the first style fusion code and a second sample object style image corresponding to the second style fusion code; the discrimination network to be trained includes an object discrimination network, a style object discrimination network, and a style encoding discrimination network; and the target discrimination information includes object discrimination information, style object discrimination information, and style encoding discrimination information;
inputting the sample object style image, the preset style object image, the preset object image, the first sample style code, and the second sample style code into the discrimination network to be trained for style discrimination processing to obtain the target discrimination information includes:
inputting the second sample object style image and the preset object image into the object discrimination network for object discrimination processing to obtain the object discrimination information;
inputting the first sample object style image and the preset style object image into the style object discrimination network for style object discrimination processing to obtain the style object discrimination information; and
inputting the second sample object style image, the first sample style code, and the second sample style code into the style encoding discrimination network for style encoding discrimination processing to obtain the style encoding discrimination information.
According to a second aspect of the embodiments of the present disclosure, an image generation method is provided, including:
acquiring a first original object image of a first target object; and
inputting the first original object image into a first style conversion network for style conversion processing, to obtain a first target object style image corresponding to the first target object;
where the first style conversion network is obtained by performing adversarial training on a first preset image generation network based on first sample object images and preset object style images of the target style generated by any image generation method provided in the first aspect.
According to a third aspect of the embodiments of the present disclosure, an image generation method is provided, including:
acquiring a second original object image and a target style label of a second target object; and
inputting the second original object image and the target style label into a second style conversion network for style conversion processing, to obtain a second target object style image corresponding to the second target object;
where the second style conversion network is obtained by performing adversarial training on a second preset image generation network based on second sample object images, multiple target style labels, and preset object style images of multiple target styles generated by any image generation method provided in the first aspect.
According to a fourth aspect of the embodiments of the present disclosure, an image generation device is provided, including:
a code acquisition module configured to acquire a preset object code and a target style code of a target style;
a first style fusion processing module configured to perform style fusion processing on the target style code and the preset object code based on network fusion parameters corresponding to a preset number of network layers in a style fusion network, to obtain a target style fusion code, where the network fusion parameters are determined based on fusion data corresponding to the preset number of network layers and a target fusion weight, and the target fusion weight is obtained by performing fusion weight learning based on the target style code and the preset object code; and
a first image generation processing module configured to input the target style fusion code into a target image generation network for image generation processing, to obtain a preset object style image corresponding to the target style.
In some embodiments, the first style fusion processing module includes:
a target fusion control data acquisition unit configured to acquire target fusion control data, where the target fusion control data is used to control the fusion position of the target style code and the preset object code in the style fusion network;
a fusion data determination unit configured to determine the fusion data corresponding to the preset number of network layers according to the target fusion control data;
a first splicing processing unit configured to perform splicing processing on the target style code and the preset object code to obtain a target spliced code;
a first fusion weight learning unit configured to perform fusion weight learning based on the target spliced code to obtain the target fusion weight;
a first weighting processing unit configured to perform weighting processing on the fusion data and the target fusion weight to obtain the network fusion parameters; and
a first style fusion processing unit configured to perform style fusion processing on the target style code and the preset object code in the preset number of network layers based on the network fusion parameters, to obtain the target style fusion code.
In some embodiments, the fusion data determination unit includes:
a comparison unit configured to compare the layer numbers corresponding to the preset number of network layers with the target fusion control data to obtain a comparison result; and
a fusion data determination subunit configured to determine the fusion data corresponding to the preset number of network layers according to the comparison result.
In some embodiments, the code acquisition module includes:
a reference style image acquisition unit configured to acquire a reference style image of the target style; and
a style encoding processing unit configured to input the reference style image into a style encoding network for style encoding processing to obtain the target style code.
In some embodiments, the device further includes:
a sample image acquisition module configured to acquire a positive sample style image pair and a negative sample style image pair of the target style;
a style encoding processing module configured to input the positive sample style image pair and the negative sample style image pair into a style encoding network to be trained for style encoding processing, to obtain sample style codes corresponding to the positive sample style image pair and the negative sample style image pair respectively;
a perceptual processing module configured to input the sample style codes into a perceptual network to be trained for perceptual processing, to obtain sample perceptual feature information corresponding to the positive sample style image pair and the negative sample style image pair respectively;
a comparison loss information determination module configured to determine comparison loss information according to the sample perceptual feature information;
a first network training module configured to train the style encoding network to be trained and the perceptual network to be trained based on the comparison loss information; and
a style encoding network determination module configured to use the trained style encoding network to be trained as the style encoding network.
In some embodiments, the code acquisition module includes:
an initial style code generation unit configured to randomly generate an initial style code based on a first preset distribution; and
a first perceptual processing unit configured to input the initial style code into a first multi-layer perceptual network for perceptual processing to obtain the target style code.
In some embodiments, the code acquisition module includes:
an initial object code generation unit configured to randomly generate an initial object code based on a second preset distribution; and
a second perceptual processing unit configured to input the initial object code into a second multi-layer perceptual network for perceptual processing to obtain the preset object code.
In some embodiments, the device further includes:
a sample data acquisition module configured to acquire a first sample style code of the target style, a sample object code, a second sample style code of a non-target style, a preset style object image, and a preset object image;
a second style fusion processing module configured to perform style fusion processing on the first sample style code and the sample object code based on sample network fusion parameters corresponding to a preset number of network layers to be trained in a style fusion network to be trained, to obtain a sample style fusion code;
a second image generation processing module configured to input the sample style fusion code into an image generation network to be trained for image generation processing, to obtain a sample object style image corresponding to the target style;
a style discrimination processing module configured to input the sample object style image, the preset style object image, the preset object image, the first sample style code, and the second sample style code into a discrimination network to be trained for style discrimination processing, to obtain target discrimination information;
a target loss information determination module configured to determine target loss information according to the target discrimination information;
a second network training module configured to train the style fusion network to be trained, the image generation network to be trained, and the discrimination network to be trained based on the target loss information; and
a network determination module configured to use the trained style fusion network to be trained as the style fusion network, and to use the trained image generation network to be trained as the target image generation network.
In some embodiments, the sample style fusion code includes a first style fusion code and a second style fusion code, and the second style fusion processing module includes:
a sample fusion control data acquisition unit configured to acquire first fusion control data and second fusion control data, where the first fusion control data is used to control the first sample style code and the sample object code to start fusing from the first network layer of the style fusion network to be trained, and the second fusion control data is used to control the first sample style code so that it does not participate in fusion in the style fusion network to be trained;
a sample fusion data determination unit configured to determine first sample fusion data and second sample fusion data corresponding to the preset number of network layers to be trained according to the first fusion control data and the second fusion control data respectively;
a second splicing processing unit configured to perform splicing processing on the first sample style code and the sample object code to obtain a sample spliced code;
a second fusion weight learning unit configured to perform fusion weight learning based on the sample spliced code to obtain a sample fusion weight;
a second weighting processing unit configured to perform weighting processing on the first sample fusion data and the second sample fusion data with the sample fusion weight respectively, to obtain first sample network fusion parameters and second sample network fusion parameters; and
a second style fusion processing unit configured to perform style fusion processing on the first sample style code and the sample object code in the preset number of network layers to be trained based on the first sample network fusion parameters and the second sample network fusion parameters respectively, to obtain the first style fusion code and the second style fusion code.
In some embodiments, the sample object style image includes a first sample object style image corresponding to the first style fusion code and a second sample object style image corresponding to the second style fusion code; the discrimination network to be trained includes an object discrimination network, a style object discrimination network, and a style encoding discrimination network; and the target discrimination information includes object discrimination information, style object discrimination information, and style encoding discrimination information;
the style discrimination processing module includes:
an object discrimination processing unit configured to input the second sample object style image and the preset object image into the object discrimination network for object discrimination processing to obtain the object discrimination information;
a style object discrimination processing unit configured to input the first sample object style image and the preset style object image into the style object discrimination network for style object discrimination processing to obtain the style object discrimination information; and
a style encoding discrimination processing unit configured to input the second sample object style image, the first sample style code, and the second sample style code into the style encoding discrimination network for style encoding discrimination processing to obtain the style encoding discrimination information.
根据本公开实施例的第五方面,提供一种图像生成装置,包括:According to a fifth aspect of an embodiment of the present disclosure, an image generating device is provided, including:
原始对象图像获取模块,被配置为执行获取第一目标对象的第一原始对象图像;an original object image acquisition module configured to perform acquisition of a first original object image of the first target object;
第一风格转换处理模块,被配置为执行将所述第一原始对象图像输入第一风格转换网络进行风格转换处理,得到所述第一目标对象对应的第一目标对象风格图像;The first style conversion processing module is configured to perform style conversion processing by inputting the first original object image into a first style conversion network to obtain a first target object style image corresponding to the first target object;
所述第一风格转换网络为基于第一样本对象图像和如第一方面提供的任一图像生成方法生成的目标风格的预设对象风格图像,对第一预设图像生成网络进行对抗训练得到的。The first style conversion network is a preset object style image based on the first sample object image and the target style generated by any image generation method provided in the first aspect, and the first preset image generation network is subjected to adversarial training to obtain of.
According to a sixth aspect of the embodiments of the present disclosure, an image generation apparatus is provided, including:
a data acquisition module configured to acquire a second original object image of a second target object and a target style label; and
a second style conversion processing module configured to input the second original object image and the target style label into a second style conversion network for style conversion processing to obtain a second target object style image corresponding to the second target object;
where the second style conversion network is obtained by performing adversarial training on a second preset target image generation network based on second sample object images, multiple target style labels, and preset object style images of multiple target styles generated by any image generation method provided in the first aspect.
According to a seventh aspect of the embodiments of the present disclosure, an electronic device is provided, including: a processor; and a memory for storing instructions executable by the processor; where the processor is configured to execute the instructions to implement the method according to any one of the first, second, and third aspects above.
According to an eighth aspect of the embodiments of the present disclosure, a computer-readable storage medium is provided. When the instructions in the storage medium are executed by a processor of an electronic device, the electronic device is enabled to execute the method according to any one of the first, second, and third aspects of the embodiments of the present disclosure.
According to a ninth aspect of the embodiments of the present disclosure, a computer program product containing instructions is provided, which, when run on a computer, causes the computer to execute the method according to any one of the first, second, and third aspects of the embodiments of the present disclosure.
In the process of generating a stylized image (preset object style image) of a certain class of objects, the stylized image is decoupled into two parts: an object code and a style code. In the style fusion network, style fusion processing is performed on the target style code and the preset object code using network fusion parameters determined from the target fusion weight and the fusion data corresponding to the preset number of network layers. By fusing the two codes, a target style fusion code is obtained that both represents the object features of the object class and effectively incorporates the target style. Because the target fusion weight is learned from the target style code and the preset object code, the fusion weight of an object under different target styles can be adjusted adaptively, yielding a better object style code under the target style. On the basis of greatly improving the stylization effect and the quality of the stylized images, this greatly improves the efficiency of adaptively generating multi-style object style images.
Brief Description of the Drawings
Fig. 1 is a schematic diagram of an application environment according to an exemplary embodiment;
Fig. 2 is a flowchart of an image generation method according to an exemplary embodiment;
Fig. 3 is a schematic diagram of the network structure of a style encoding network according to an exemplary embodiment;
Fig. 4 is a flowchart of pre-training a style encoding network according to an exemplary embodiment;
Fig. 5 is a flowchart of performing style fusion processing on a target style code and a preset object code based on network fusion parameters corresponding to a preset number of network layers in a style fusion network, to obtain a target style fusion code, according to an exemplary embodiment;
Fig. 6 is a flowchart of pre-training a target image generation network and a style fusion network according to an exemplary embodiment;
Fig. 7 is a flowchart of performing style fusion processing on a first sample style code and a sample object code based on sample network fusion parameters corresponding to a preset number of network layers to be trained in a style fusion network to be trained, to obtain a sample style fusion code, according to an exemplary embodiment;
Fig. 8 is a schematic diagram of training a style fusion network and a target image generation network according to an exemplary embodiment;
Fig. 9 is a flowchart of an image generation method according to an exemplary embodiment;
Fig. 10 is a flowchart of another image generation method according to an exemplary embodiment;
Fig. 11 is a block diagram of an image generation apparatus according to an exemplary embodiment;
Fig. 12 is a block diagram of another image generation apparatus according to an exemplary embodiment;
Fig. 13 is a block diagram of another image generation apparatus according to an exemplary embodiment;
Fig. 14 is a block diagram of an electronic device for image generation according to an exemplary embodiment;
Fig. 15 is a block diagram of another electronic device for image generation according to an exemplary embodiment.
Detailed Description
It should be noted that the terms "first", "second", and the like in the specification, the claims, and the above drawings of the present disclosure are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that the data so used are interchangeable under appropriate circumstances, so that the embodiments of the present disclosure described herein can be practiced in sequences other than those illustrated or described herein. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatuses and methods consistent with some aspects of the present disclosure as detailed in the appended claims.
It should be noted that the user information (including but not limited to user device information, user personal information, and the like) and data (including but not limited to data for display, data for analysis, and the like) involved in the present disclosure are information and data authorized by the user or fully authorized by all parties.
Please refer to Fig. 1, which is a schematic diagram of an application environment according to an exemplary embodiment. As shown in Fig. 1, the application environment may include a terminal 100 and a server 200.
The terminal 100 may be used to provide any user with a service for generating a stylized image (object style image) of a target object. In some embodiments, the terminal 100 may include, but is not limited to, a smartphone, a desktop computer, a tablet computer, a notebook computer, a smart speaker, a digital assistant, an augmented reality (AR)/virtual reality (VR) device, a smart wearable device, or other types of electronic devices, and may also be software running on such electronic devices, such as an application program. In some embodiments, the operating system running on the electronic device may include, but is not limited to, the Android system, the iOS system, Linux, Windows, and the like.
In some embodiments, the server 200 may provide background services for the terminal 100, pre-generate object style images for training a style conversion network, and train a network that can be used to convert object images into stylized images (object style images of the target style). In some embodiments, the server 200 may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN (Content Delivery Network), and big data and artificial intelligence platforms.
In addition, it should be noted that Fig. 1 shows only one application environment provided by the present disclosure; in actual applications, other application environments may also be included, for example, more terminals.
In the embodiments of this specification, the terminal 100 and the server 200 may be directly or indirectly connected through wired or wireless communication, which is not limited in the present disclosure.
Fig. 2 is a flowchart of an image generation method according to an exemplary embodiment. As shown in Fig. 2, the image generation method is used in electronic devices such as terminals and servers, and includes the following steps.
In step S201, a preset object code and a target style code of a target style are acquired.
In some embodiments, the target style may be any image style. Image styles may be divided in various ways according to actual application requirements. In some embodiments, the target style may include, but is not limited to, image styles such as animation, oil painting, and pencil drawing.
In some embodiments, the target style code may be coded information capable of representing the style features of the target style. The preset object code may be coded information capable of representing the object features of a certain class of objects. In some embodiments, the objects may include, but are not limited to, human faces, cat faces, dog faces, and other objects requiring style conversion.
In some embodiments, the target style may be an image style extracted from a certain reference style image. Correspondingly, acquiring the target style code of the target style may include:
acquiring a reference style image of the target style; and
inputting the reference style image into a style encoding network for style encoding processing to obtain the target style code.
In some embodiments, a style image may be an image with a certain style. Taking a human face as the object and an animation style as the target style, the style image is an animation face image.
In some embodiments, the style encoding network may be obtained by performing contrastive training on a style encoding network to be trained based on positive sample style image pairs and negative sample style image pairs of the target style.
In some embodiments, the network structure of the style encoding network may be preset according to the actual application. In some embodiments, as shown in Fig. 3, Fig. 3 is a schematic diagram of the network structure of a style encoding network provided according to an exemplary embodiment. In some embodiments, the style encoding network may include a convolutional neural network, a style feature extraction network, a feature concatenation network, and a multi-layer perceptron network connected in sequence.
In some embodiments, the convolutional neural network may be used to extract image feature information of the reference style image; the style feature extraction network may be used to extract style feature information from the image feature information; and the feature concatenation network may be used to concatenate the style feature information into a long vector of a preset dimension (the preset dimension is consistent with the input dimension of the multi-layer perceptron network). The multi-layer perceptron network may include two multi-layer perceptron parts connected in sequence. The first part may be used for dimensionality reduction and for converting the long vector of the preset dimension into a denser style code, which makes the image generation network easier to train. The second part may be used to reduce the dimensionality of the coded information output by the first part and to transform the coded information from the coding space to the distribution space of the image generation network.
In some embodiments, the specific network structures of the convolutional neural network, the style feature extraction network, the feature concatenation network, and the multi-layer perceptron network may also be set according to the actual application.
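For illustration only, the following Python (PyTorch) sketch shows one possible realization of such a style encoding network; the layer sizes, the pooling resolution, and the code dimension are assumptions made for this example, not values specified by the present disclosure.

import torch
import torch.nn as nn

class StyleEncoder(nn.Module):
    """Minimal sketch: CNN backbone, style feature extraction head,
    feature concatenation into a long vector, then a two-part MLP."""

    def __init__(self, code_dim=512, flat_dim=2048):
        super().__init__()
        self.backbone = nn.Sequential(           # image feature extraction
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.style_head = nn.Conv2d(128, 32, 1)  # style feature extraction
        self.pool = nn.AdaptiveAvgPool2d(8)      # fix spatial size before flattening
        self.mlp1 = nn.Sequential(               # first MLP part: densify the code
            nn.Linear(32 * 8 * 8, flat_dim), nn.ReLU(),
            nn.Linear(flat_dim, code_dim),
        )
        # second MLP part: map into the generator's distribution space
        self.mlp2 = nn.Linear(code_dim, code_dim)

    def forward(self, image):
        feats = self.backbone(image)
        style = self.pool(self.style_head(feats))
        flat = style.flatten(1)                  # concatenation into a long vector
        return self.mlp2(self.mlp1(flat))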
In some embodiments, the above method may further include a step of pre-training the style encoding network. As shown in Fig. 4, pre-training the style encoding network may include the following steps:
In step S401, positive sample style image pairs and negative sample style image pairs of the target style are acquired;
In step S403, the positive sample style image pairs and the negative sample style image pairs are input into the style encoding network to be trained for style encoding processing, to obtain the sample style codes corresponding to the positive sample style image pairs and the negative sample style image pairs;
In step S405, the sample style codes are input into a perception network to be trained for perception processing, to obtain the sample perceptual feature information corresponding to the positive sample style image pairs and the negative sample style image pairs;
In step S407, contrastive loss information is determined according to the sample perceptual feature information;
In step S409, the style encoding network to be trained and the perception network to be trained are trained based on the contrastive loss information;
In step S411, the trained style encoding network to be trained is used as the style encoding network.
In some embodiments, the network structure of the style encoding network to be trained is the same as that of the style encoding network, but the network parameters are different.
In some embodiments, a reference style image of the target style may be acquired and subjected to multiple affine transformations; correspondingly, the style image after each affine transformation and the reference style image form a positive sample style image pair. In some embodiments, the translation amounts of the multiple affine transformations may be different.
In some embodiments, multiple reference style images of non-target styles may be acquired, and each of them may be paired with the reference style image of the target style to form multiple negative sample style image pairs of the target style. In some embodiments, multiple affine transformations may also be performed on a reference style image of a non-target style, and each transformed style image may be paired with the reference style image of the target style to form multiple negative sample style image pairs.
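As a non-limiting illustration of constructing the sample pairs, the following Python sketch uses torchvision's RandomAffine transform; the transform ranges and the number of views are assumed values, and the inputs are assumed to be PIL images or image tensors.

import torchvision.transforms as T

affine = T.RandomAffine(degrees=10, translate=(0.1, 0.1))  # illustrative ranges

def build_pairs(target_img, other_imgs, n_views=4):
    # positive pairs: the target-style reference vs. its affine-transformed variants
    positives = [(target_img, affine(target_img)) for _ in range(n_views)]
    # negative pairs: the target-style reference vs. non-target-style images
    negatives = [(target_img, img) for img in other_imgs]
    return positives, negatives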
In some embodiments, in order to train the style encoding network to be trained with a self-supervised learning strategy, the sample style codes output by the style encoding network to be trained for the positive sample style image pairs and the negative sample style image pairs may be input into the perception network to be trained for perception processing, to obtain the sample perceptual feature information corresponding to the positive and negative sample style image pairs, and the contrastive loss information is determined according to the sample perceptual feature information.
In some embodiments, a preset contrastive loss function may be used in the process of determining the contrastive loss information according to the sample perceptual feature information. In some embodiments, the preset contrastive loss function may be any contrastive loss function, for example, the NT-Xent loss (the normalized temperature-scaled cross entropy loss).
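As a non-limiting illustration, the following Python sketch implements the standard NT-Xent loss over a batch of paired perceptual features, where the two views of each positive pair sit at the same index of z_a and z_b and all other batch entries act as negatives; mapping the explicit negative pairs described above onto such a batch is an assumption of the example.

import torch
import torch.nn.functional as F

def nt_xent(z_a, z_b, temperature=0.5):
    """z_a[i] and z_b[i] are the perceptual features of the two views of
    the i-th positive pair; shapes (N, D). Temperature is illustrative."""
    z = F.normalize(torch.cat([z_a, z_b], dim=0), dim=1)   # 2N x D
    sim = z @ z.t() / temperature                          # scaled cosine similarities
    n = z_a.size(0)
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim.masked_fill_(mask, float('-inf'))                  # drop self-similarity
    # for row i, the positive is the other view of the same pair
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)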
In some embodiments, training the style encoding network to be trained and the perception network to be trained based on the contrastive loss information may include: updating the network parameters in the style encoding network to be trained and the perception network to be trained based on the contrastive loss information; and then, based on the updated style encoding network to be trained and perception network to be trained, repeating the training iteration from inputting the positive and negative sample style image pairs into the style encoding network to be trained for style encoding processing to obtain their corresponding sample style codes, through updating the network parameters in the style encoding network to be trained and the perception network to be trained based on the contrastive loss information, until a first preset convergence condition is reached.
In some embodiments, when the first preset convergence condition is reached, the current style encoding network to be trained (the trained style encoding network to be trained) is used as the style encoding network.
In some embodiments, reaching the first preset convergence condition may be that the number of training iterations reaches a first preset number of training iterations. In some embodiments, reaching the first preset convergence condition may also be that the contrastive loss information is less than a first preset threshold. In the embodiments of this specification, the first preset number of training iterations and the first preset threshold may be preset according to the training speed and accuracy required of the network in the actual application.
In the above embodiments, since the positive and negative sample style image pairs of the target style are used for contrastive training of the style encoding network to be trained, self-supervised training of the style encoding network can be realized, which effectively guarantees the accuracy with which the trained style encoding network represents style features. Moreover, extracting the target style code of the target style from a style image of the target style based on the trained style encoding network can effectively improve the accuracy with which the target style code represents the target style.
In some embodiments, the target style may also be an image style obtained by random sampling. Correspondingly, acquiring the target style code of the target style includes:
randomly generating an initial style code based on a first preset distribution; and
inputting the initial style code into a first multi-layer perceptron network for perception processing to obtain the target style code.
In some embodiments, the first preset distribution may be a preset code distribution. In some embodiments, the first preset distribution may include, but is not limited to, a Gaussian distribution.
In some embodiments, the first multi-layer perceptron network may be used to reduce the dimensionality of the initial style code and to transform the initial style code from the coding space (for example, a Gaussian distribution space) to the distribution space of the image generation network.
In the above embodiments, randomly generating the initial style code improves the generation efficiency and diversity of style codes and greatly increases the flexibility of generating object style images. Moreover, randomly generating the initial style code based on the first preset distribution and then performing perception processing with the first multi-layer perceptron network makes the image generation network easier to train.
In some embodiments, the preset object code may be acquired by the following steps:
randomly generating an initial object code based on a second preset distribution; and
inputting the initial object code into a second multi-layer perceptron network for perception processing to obtain the preset object code.
In some embodiments, the second preset distribution may be a preset code distribution. In some embodiments, the second preset distribution may include, but is not limited to, a Gaussian distribution.
In some embodiments, the second multi-layer perceptron network may be used to reduce the dimensionality of the initial object code and to transform the initial object code from the coding space (for example, a Gaussian distribution space) to the distribution space of the image generation network.
In the above embodiments, randomly generating the initial object code improves the generation efficiency of object codes, so that a large number of object style images can be quickly generated for a certain style, effectively improving the image generation efficiency for that style. Moreover, randomly generating the initial object code based on the second preset distribution and then performing perception processing with the second multi-layer perceptron network makes the image generation network easier to train.
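For illustration, the following Python sketch samples the initial style and object codes from a Gaussian distribution and maps them into the distribution space of the image generation network with two mapping MLPs; the dimensions and depth used here are assumptions of the example.

import torch
import torch.nn as nn

def make_mapper(in_dim=512, out_dim=512, depth=4):
    layers = []
    for i in range(depth):
        layers += [nn.Linear(in_dim if i == 0 else out_dim, out_dim),
                   nn.LeakyReLU(0.2)]
    return nn.Sequential(*layers)

style_mapper = make_mapper()    # "first multi-layer perceptron network"
object_mapper = make_mapper()   # "second multi-layer perceptron network"

z_style = torch.randn(1, 512)   # initial style code ~ N(0, I)
z_object = torch.randn(1, 512)  # initial object code ~ N(0, I)
target_style_code = style_mapper(z_style)
preset_object_code = object_mapper(z_object)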
In step S203, style fusion processing is performed on the target style code and the preset object code based on the network fusion parameters corresponding to the preset number of network layers in the style fusion network, to obtain the target style fusion code.
In some embodiments, the style fusion network may be used to perform style fusion processing on the target style code and the preset object code. In some embodiments, the style fusion network may include a preset number of network layers; in some embodiments, the number of network layers (the preset number) may be set according to the actual application.
In some embodiments, the style fusion network may be a style fusion network capable of regulating the degree of fusion and adaptively adjusting the fusion weights. In some embodiments, the degree of fusion may be controlled by regulating the fusion position of the target style code and the preset object code in the style fusion network (that is, from which network layer the fusion starts). In some embodiments, fusion weight learning may be performed on the target style code and the preset object code to adaptively adjust the fusion weights of the object under different target styles.
In some embodiments, the network fusion parameters may be determined based on the fusion data corresponding to the preset number of network layers and the target fusion weight, and the target fusion weight is obtained by performing fusion weight learning based on the target style code and the preset object code.
In some embodiments, as shown in Fig. 5, performing style fusion processing on the target style code and the preset object code based on the network fusion parameters corresponding to the preset number of network layers in the style fusion network to obtain the target style fusion code may include the following steps:
In step S501, target fusion control data is acquired.
In some embodiments, the target fusion control data is used to control the fusion position of the target style code and the preset object code in the style fusion network.
In step S503, the fusion data corresponding to the preset number of network layers is determined according to the target fusion control data.
In some embodiments, determining the fusion data corresponding to the preset number of network layers according to the target fusion control data may include:
comparing the layer indices of the preset number of network layers with the target fusion control data to obtain comparison results; and
determining the fusion data corresponding to the preset number of network layers according to the comparison results.
In some embodiments, the fusion data corresponding to each network layer may indicate whether the target style code participates in fusion at that network layer. In some embodiments, the preset number of network layers included in the style fusion network may be arranged in order, for example, from layer 0 to layer n. In some embodiments, when the comparison result indicates that the layer index of a network layer is less than the target fusion control data, the fusion data corresponding to that network layer is 0, that is, the target style code does not participate in fusion at that network layer; conversely, when the comparison result indicates that the layer index of a network layer is greater than or equal to the target fusion control data, the fusion data corresponding to that network layer is 1, that is, the target style code participates in fusion at that network layer.
In the above embodiments, using the fusion data determined from the target fusion control data to control whether the target style code participates in fusion at each of the network layers enables controllable style feature fusion. This in turn guarantees that the subsequently generated object style images have high similarity to the natural object while exhibiting different degrees of stylization, realizes flexible control of the strength of object stylization, better meets the needs of different scenes, and greatly improves the image quality of the generated object style images and the efficiency of generating them.
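As a non-limiting illustration of step S503, the following Python sketch derives the per-layer fusion data from the target fusion control data by the comparison described above.

import torch

def fusion_data(num_layers, control):
    """Layer i gets fusion data 1 if i >= control (the style code
    participates in fusion there), else 0. `control` is the target
    fusion control data."""
    layers = torch.arange(num_layers)
    return (layers >= control).float()   # shape (num_layers,)

print(fusion_data(18, 9))  # layers 0-8 -> 0.0, layers 9-17 -> 1.0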
In step S505, the target style code and the preset object code are concatenated to obtain a target concatenated code.
In step S507, fusion weight learning is performed based on the target concatenated code to obtain the target fusion weight.
In some embodiments, the target concatenated code may be input into a fully connected layer for fusion weight learning to obtain the target fusion weight. In some embodiments, the target fusion weight may be used to control the fusion ratio of the target style code and the preset object code in each fusion layer.
In the above embodiments, fusion weight learning is performed on the target concatenated code that combines the target style code and the preset object code, so that the learned target fusion weight adapts to different types of objects and styles, thereby better fusing out the object image under the target style.
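For illustration, the following Python sketch realizes steps S505 and S507 with a single fully connected layer over the concatenated codes; the sigmoid that bounds the learned weights to (0, 1) is an assumption of the example, not a requirement stated in the text.

import torch
import torch.nn as nn

class FusionWeightLearner(nn.Module):
    def __init__(self, code_dim=512, num_layers=18):
        super().__init__()
        self.fc = nn.Linear(2 * code_dim, num_layers)

    def forward(self, style_code, object_code):
        concat = torch.cat([style_code, object_code], dim=1)  # target concatenated code
        return torch.sigmoid(self.fc(concat))                 # (batch, num_layers)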
In step S509, the fusion data and the target fusion weight are weighted to obtain the network fusion parameters.
In some embodiments, the target fusion weight and the fusion data are both data in matrix form. Correspondingly, the corresponding elements of the target fusion weight and the fusion data may be multiplied to obtain the network fusion parameters.
In step S511, style fusion processing is performed on the target style code and the preset object code in the preset number of network layers based on the network fusion parameters, to obtain the target style fusion code.
In some embodiments, the network fusion parameters corresponding to the preset number of network layers may be used as the weights of the target style code, and an all-ones matrix minus the network fusion parameters may be used as the weights of the preset object code. Based on the weights of the target style code and the weights of the preset object code, a weighted sum of the target style code and the preset object code is computed to obtain the target style fusion code.
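As a non-limiting illustration of steps S509 and S511, the following Python sketch multiplies the learned weights by the fusion data element-wise to obtain the network fusion parameters and then forms the per-layer weighted sum of the two codes.

import torch

def style_fusion(style_code, object_code, weight, mask):
    """`weight` comes from the FusionWeightLearner sketch above and `mask`
    from fusion_data; weight has shape (batch, num_layers), mask shape
    (num_layers,). The codes have shape (batch, code_dim)."""
    params = weight * mask                          # network fusion parameters
    params = params.unsqueeze(-1)                   # (batch, num_layers, 1)
    fused = (params * style_code.unsqueeze(1)       # per-layer weighted sum
             + (1.0 - params) * object_code.unsqueeze(1))
    return fused                                    # (batch, num_layers, code_dim)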
In some embodiments, suppose the fusion network includes 18 network layers. In some embodiments, when the target fusion control data is 0, the target style code and the preset object code are fused starting from the first network layer, and correspondingly, the subsequent image generation network outputs a fully stylized object style image. When the target fusion control data is 18, the target style code does not participate in fusion, and correspondingly, the subsequent image generation network outputs an unstylized natural object image. When the target fusion control data is greater than 1 and less than 18, the target style code and the preset object code are fused starting from the i-th layer (where i is the target fusion control data), and the low-resolution layers (that is, layer 1 to layer i-1) are not affected by the target style code. Based on the target fusion control data, the target style fusion code is guaranteed to retain the same feature information as the natural object while having different degrees of style features, which in turn guarantees that the subsequently generated object style images have high similarity to the natural object and different degrees of stylization, realizing flexible control of the strength of object stylization. Correspondingly, using the target fusion control data to control the fusion position of the target style code and the preset object code in the style fusion network may mean controlling the position at which the target style code and the preset object code start to be fused in the style fusion network.
In the above embodiments, in the style fusion network, controlling the fusion position of the target style code and the preset object code across the network layers according to the target fusion control data enables the degree of fusion to be regulated. Moreover, fusion weight learning is performed on the target concatenated code that combines the target style code and the preset object code, so that the learned target fusion weight adapts to different types of objects and styles, realizing adaptive adjustment of the fusion weights of the object under different target styles and thereby better fusing out the object style code under the target style. This guarantees that the obtained target style fusion code retains the same feature information as the natural object while having controllable style features and adaptive fusion weights, which in turn guarantees that the subsequently generated object style images have high similarity to the natural object and different degrees of stylization. It realizes flexible control of the strength of object stylization, improves the efficiency of adaptively generating multi-style object style images, better meets the needs of different scenes, and greatly improves the image quality of the generated object style images and the efficiency of generating multi-style object style images.
In step S205, the target style fusion code is input into the target image generation network for image generation processing, to obtain the preset object style image corresponding to the target style.
In some embodiments, the target image generation network may be used to generate the preset object style image corresponding to the target style. In some embodiments, the above method further includes a step of pre-training the target image generation network and the style fusion network. In some embodiments, as shown in Fig. 6, pre-training the target image generation network and the style fusion network may include the following steps:
In step S601, a first sample style code of the target style, a sample object code, a second sample style code of a non-target style, a preset style object image, and a preset object image are acquired.
In some embodiments, the first sample style code and the second sample style code may be acquired in the same way as the target style code described above, which will not be repeated here. The sample object code may be acquired in the same way as the preset object code described above, which will not be repeated here.
In some embodiments, the preset style object image may be acquired from a training set of stylized object images corresponding to the target style. Taking a human face as the object, a preset style face image may be acquired from a collected training set of face style images of the target style. In some embodiments, the preset object image may be an original object image; taking a human face as the object, it may be an image of a certain real human face.
In step S603, style fusion processing is performed on the first sample style code and the sample object code based on the sample network fusion parameters corresponding to the preset number of network layers to be trained in the style fusion network to be trained, to obtain a sample style fusion code.
In some embodiments, the style fusion network to be trained may include a preset number of network layers to be trained. The sample network fusion parameters may be determined based on the sample fusion data and sample fusion weights corresponding to the preset number of network layers to be trained, and the sample fusion weights are obtained by performing fusion weight learning based on the first sample style code and the sample object code.
In some embodiments, the sample style fusion code includes a first style fusion code and a second style fusion code. Correspondingly, as shown in Fig. 7, performing style fusion processing on the first sample style code and the sample object code based on the sample network fusion parameters corresponding to the preset number of network layers to be trained in the style fusion network to be trained, to obtain the sample style fusion code, may include the following steps:
In step S701, first fusion control data and second fusion control data are acquired.
In some embodiments, the first fusion control data is used to control the first sample style code and the sample object code to be fused starting from the first network layer in the style fusion network to be trained; following the embodiment of the target fusion control data described above, the first fusion control data is 1. In some embodiments, the second fusion control data may be used to control the first sample style code so that it does not participate in fusion in the style fusion network to be trained; taking the above embodiment with a preset number of 18 as an example, the second fusion control data is 18.
In step S703, the first sample fusion data and the second sample fusion data corresponding to the preset number of network layers to be trained are determined according to the first fusion control data and the second fusion control data, respectively.
In some embodiments, the first sample fusion data corresponding to the preset number of network layers to be trained is determined according to the first fusion control data, and the second sample fusion data corresponding to the preset number of network layers to be trained is determined according to the second fusion control data.
In step S705, the first sample style code and the sample object code are concatenated to obtain a sample concatenated code.
In step S707, fusion weight learning is performed based on the sample concatenated code to obtain the sample fusion weights.
In step S709, the first sample fusion data and the second sample fusion data are each weighted with the sample fusion weights to obtain the first sample network fusion parameters and the second sample network fusion parameters.
In step S711, based on the first sample network fusion parameters and the second sample network fusion parameters, style fusion processing is performed on the first sample style code and the sample object code in the preset number of network layers to be trained, respectively, to obtain the first style fusion code and the second style fusion code.
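For illustration, the following Python sketch ties steps S701 to S711 to the helper sketches given earlier, using the control values 1 and 18 from the example above to produce the first and second style fusion codes; all names and dimensions are assumed for the example.

import torch

num_layers = 18
weights = FusionWeightLearner(num_layers=num_layers)

style = torch.randn(1, 512)     # first sample style code (target style)
obj = torch.randn(1, 512)       # sample object code
w = weights(style, obj)         # sample fusion weights, shape (1, 18)

mask_fuse = fusion_data(num_layers, 1)    # first fusion control data
mask_skip = fusion_data(num_layers, 18)   # second fusion control data: all zeros

first_code = style_fusion(style, obj, w, mask_fuse)   # first style fusion code
second_code = style_fusion(style, obj, w, mask_skip)  # second style fusion code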
In some embodiments, for the details of the above steps S705 to S711, refer to the details of the above steps S505 to S511, which will not be repeated here.
In the above embodiments, during network training, combining the first fusion control data and the second fusion control data allows the sample object code to be style-fused with the first sample style code to different degrees, enabling flexible control of the strength of object stylization so as to improve the image quality and generation efficiency of the subsequently generated object style images.
In step S605, the sample style fusion code is input into the image generation network to be trained for image generation processing, to obtain the sample object style image corresponding to the target style.
In some embodiments, the sample object style image may include a first sample object style image corresponding to the first style fusion code and a second sample object style image corresponding to the second style fusion code. Correspondingly, inputting the sample style fusion code into the image generation network to be trained for image generation processing to obtain the sample object style image corresponding to the target style includes: inputting the first style fusion code and the second style fusion code into the image generation network to be trained for image generation processing to obtain the first sample object style image and the second sample object style image.
In step S607, the sample object style image, the preset style object image, the preset object image, the first sample style code, and the second sample style code are input into the discrimination network to be trained for style discrimination processing, to obtain target discrimination information.
In some embodiments, the discrimination network to be trained includes an object discrimination network, a style object discrimination network, and a style code discrimination network. Correspondingly, the target discrimination information may include object discrimination information, style object discrimination information, and style code discrimination information.
In some embodiments, when the style fusion network to be trained is a network with a fixed fusion structure, inputting the sample object style image, the preset style object image, the preset object image, the first sample style code, and the second sample style code into the discrimination network to be trained for style discrimination processing to obtain the target discrimination information includes: inputting the sample object style image and the preset object image into the object discrimination network for object discrimination processing to obtain the object discrimination information (which may include the feature information output by the object discrimination network for the sample object style image and the preset object image); inputting the sample object style image and the preset style object image into the style object discrimination network for style object discrimination processing to obtain the style object discrimination information (which may include the feature information output by the style object discrimination network for the sample object style image and the preset style object image); and inputting the sample object style image, the first sample style code, and the second sample style code into the style code discrimination network for style code discrimination processing to obtain the style code discrimination information (which may include the feature information output by the style code discrimination network for the first sample style code and the second sample style code).
In another optional embodiment, when the style fusion network to be trained is a network capable of regulating the degree of fusion, inputting the sample object style image, the preset style object image, the preset object image, the first sample style code, and the second sample style code into the discrimination network to be trained for style discrimination processing to obtain the target discrimination information may include: inputting the second sample object style image and the preset object image into the object discrimination network for object discrimination processing to obtain the object discrimination information (which may include the feature information output by the object discrimination network for the second sample object style image and the preset object image); inputting the first sample object style image and the preset style object image into the style object discrimination network for style object discrimination processing to obtain the style object discrimination information (which may include the feature information output by the style object discrimination network for the first sample object style image and the preset style object image); and inputting the second sample object style image, the first sample style code, and the second sample style code into the style code discrimination network for style code discrimination processing to obtain the style code discrimination information (which may include the feature information output by the style code discrimination network for the first sample style code and the second sample style code).
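As a non-limiting illustration of the three discrimination passes in the controllable-fusion case, the following Python sketch uses small fully connected discriminators over flattened inputs; the architectures are assumptions (real discriminators would typically be convolutional), and feeding the second sample object style image to the style code discrimination network by concatenation with each style code is likewise an assumed conditioning scheme.

import torch
import torch.nn as nn

IMG_DIM = 3 * 64 * 64   # illustrative flattened image size
CODE_DIM = 512

def make_disc(in_dim):
    return nn.Sequential(nn.Linear(in_dim, 256), nn.LeakyReLU(0.2),
                         nn.Linear(256, 1))

obj_disc = make_disc(IMG_DIM)               # object discrimination network
style_obj_disc = make_disc(IMG_DIM)         # style object discrimination network
code_disc = make_disc(IMG_DIM + CODE_DIM)   # style code discrimination network

def discriminate(first_img, second_img, preset_obj_img, preset_style_img,
                 first_code, second_code):
    obj_info = (obj_disc(second_img), obj_disc(preset_obj_img))
    style_obj_info = (style_obj_disc(first_img), style_obj_disc(preset_style_img))
    code_info = (code_disc(torch.cat([second_img, first_code], dim=1)),
                 code_disc(torch.cat([second_img, second_code], dim=1)))
    return obj_info, style_obj_info, code_info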
In the above embodiments, performing adversarial training on the image generation network to be trained for generating object style images along the three dimensions of object discrimination, style object discrimination, and style code discrimination can greatly improve the ability of the trained image generation network to represent stylized object images and improve the quality of the generated object style images.
In step S609, target loss information is determined according to the target discrimination information.
In some embodiments, the target loss information may include generation loss information corresponding to the image generation network to be trained and discrimination loss information corresponding to the discrimination network to be trained.
In some embodiments, an adversarial loss function may be used in the process of determining the target loss information according to the target discrimination information. In some embodiments, the adversarial loss function may be used to determine the object discrimination loss between the feature information output by the object discrimination network for the second sample object style image and the preset object image; to determine the style object discrimination loss between the feature information output by the style object discrimination network for the first sample object style image and the preset style object image; and to determine the style code discrimination loss between the feature information output by the style code discrimination network for the first sample style code and the second sample style code.
Further, the object discrimination loss, the style object discrimination loss, and the style code discrimination loss may be added together to obtain the generation loss information, and the negative of the generation loss information may be used as the discrimination loss information.
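For illustration, the following Python sketch computes the target loss information as described: one adversarial loss per discriminator output pair, summed into the generation loss information, with its negative used as the discrimination loss information; the softplus-based non-saturating form of adv_loss, and the choice of which element of each pair is treated as real, are assumptions, since the text only requires some adversarial loss function.

import torch.nn.functional as F

def adv_loss(real_out, fake_out):
    return F.softplus(-real_out).mean() + F.softplus(fake_out).mean()

def target_loss(obj_info, style_obj_info, code_info):
    obj_loss = adv_loss(obj_info[1], obj_info[0])            # preset image as real
    style_obj_loss = adv_loss(style_obj_info[1], style_obj_info[0])
    code_loss = adv_loss(code_info[1], code_info[0])
    gen_loss = obj_loss + style_obj_loss + code_loss         # generation loss information
    disc_loss = -gen_loss                                    # discrimination loss information
    return gen_loss, disc_loss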
In step S611, the style fusion network to be trained, the image generation network to be trained, and the discrimination network to be trained are trained based on the target loss information.
In some embodiments, training the style fusion network to be trained, the image generation network to be trained, and the discrimination network to be trained based on the target loss information may include: updating the network parameters in the image generation network to be trained and the style fusion network to be trained based on the generation loss information, and updating the network parameters in the discrimination network to be trained (the object discrimination network, the style object discrimination network, and the style code discrimination network) based on the discrimination loss information; and then, based on the updated style fusion network to be trained, image generation network to be trained, and discrimination network to be trained, repeating the training iteration from performing style fusion processing on the first sample style code and the sample object code based on the sample network fusion parameters corresponding to the preset number of network layers to be trained in the style fusion network to be trained to obtain the sample style fusion code, through updating the network parameters based on the generation loss information and the discrimination loss information, until a second preset convergence condition is reached.
In some embodiments, reaching the second preset convergence condition may be that the number of training iterations reaches a second preset number of training iterations. In some embodiments, reaching the second preset convergence condition may also be that the generation loss information is less than a second preset threshold. In the embodiments of this specification, the second preset number of training iterations and the second preset threshold may be preset according to the training speed and accuracy required of the network in the actual application.
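As a non-limiting illustration of the training iteration in step S611 and the second preset convergence condition, the following Python sketch alternates discriminator and generator updates; generator, fusion_net, disc_params, and the compute_losses helper (which would run steps S603 to S609 and return the generation and discrimination losses) are hypothetical names, and the optimizer settings, MAX_ITERS, and THRESHOLD are assumed values.

import torch

opt_g = torch.optim.Adam(list(generator.parameters())
                         + list(fusion_net.parameters()), lr=2e-4)
opt_d = torch.optim.Adam(disc_params, lr=2e-4)

MAX_ITERS, THRESHOLD = 100_000, 0.05
for step in range(MAX_ITERS):                # second preset number of iterations
    _, disc_loss = compute_losses()          # fresh forward pass for the discriminators
    opt_d.zero_grad(); disc_loss.backward(); opt_d.step()

    gen_loss, _ = compute_losses()           # fresh forward pass for the generator
    opt_g.zero_grad(); gen_loss.backward(); opt_g.step()

    if gen_loss.item() < THRESHOLD:          # second preset threshold
        break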
In step S613, the trained style fusion network to be trained is used as the style fusion network, and the trained image generation network to be trained is used as the target image generation network.
In some embodiments, when the second preset convergence condition is reached, the current style fusion network to be trained (i.e., the trained style fusion network) is used as the above style fusion network, and the current image generation network to be trained (i.e., the trained image generation network) is used as the above target image generation network.
In some embodiments, FIG. 8 is a schematic diagram of training the style fusion network and the target image generation network according to an exemplary embodiment.
In the above embodiments, jointly training the target image generation network and the style fusion network enables the fusion of style features and object features, and training the target image generation network on the fused sample object style images can greatly improve the trained image generation network's ability to represent object styles and object features, thereby effectively improving the quality of the subsequently generated object style images.
In some embodiments, a large number of preset object style images of the target style can be generated by the above image generation method provided in the embodiments of this specification. Correspondingly, a first style transfer network can be obtained by performing adversarial training on a first preset image generation network based on first sample object images (object images of a large number of real objects) and a large number of preset object style images of the target style generated by the above image generation method. The first style transfer network can be used to generate an object image of the target style for a given target object. In some embodiments, during this adversarial training, a first sample object image may be input into the first preset image generation network for style conversion processing to obtain the object style image corresponding to the first sample object image; the object style image and the corresponding preset object style image are input into a corresponding discrimination network for style discrimination processing to obtain first style discrimination information; corresponding discrimination loss information is determined based on the first style discrimination information; the first preset image generation network and the corresponding discrimination network can then be trained based on the discrimination loss information, and the trained first preset image generation network is used as the first style transfer network.
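The following is a minimal sketch of this adversarial training loop, under stated assumptions: `gen` is the first preset image generation network, `disc` the corresponding discrimination network, and `loader` yields pairs of a sample object image and a preset object style image produced by the image generation method. The non-saturating softplus losses, the optimizer, and all names are assumptions.

```python
import torch
import torch.nn.functional as F

def train_first_style_transfer(gen, disc, loader, steps=50_000, lr=2e-4):
    g_opt = torch.optim.Adam(gen.parameters(), lr=lr)
    d_opt = torch.optim.Adam(disc.parameters(), lr=lr)
    for step, (x, y_style) in enumerate(loader):
        fake = gen(x)  # style conversion processing of the sample object image
        # style discrimination processing -> first style discrimination information
        d_loss = (F.softplus(disc(fake.detach())).mean()
                  + F.softplus(-disc(y_style)).mean())
        d_opt.zero_grad()
        d_loss.backward()
        d_opt.step()
        # generator update from the corresponding discrimination loss information
        g_loss = F.softplus(-disc(gen(x))).mean()
        g_opt.zero_grad()
        g_loss.backward()
        g_opt.step()
        if step + 1 >= steps:
            break
    return gen  # the trained network becomes the first style transfer network
```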
In some embodiments, preset object style images of multiple target styles can be generated by the above image generation method provided in the embodiments of this specification. Correspondingly, a second style transfer network can be obtained by performing adversarial training on a second preset image generation network based on second sample object images (object images of a large number of real objects), multiple target style labels, and preset object style images of multiple target styles generated by the above image generation method. The second style transfer network can be used to generate object style images of multiple target styles. In some embodiments, during this adversarial training, a second sample object image and the corresponding target style label may be input into the second preset image generation network for style conversion processing to obtain an object style image that corresponds to the second sample object image and matches the target style label; the object style image and the corresponding preset object style image are input into a corresponding discrimination network for style discrimination processing to obtain second style discrimination information; in addition, another discrimination network can be added to judge whether the object style image output by the second preset image generation network is of the style corresponding to the target style label, so as to obtain third style discrimination information; corresponding loss information is determined based on the second style discrimination information and the third style discrimination information; the second preset image generation network and the corresponding discrimination networks can then be trained based on this loss information, and the trained second preset image generation network is used as the second style transfer network.
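As an illustrative sketch of the label-conditioned generator update only, the added discrimination network is modeled here as an auxiliary style classifier `label_disc`, and its output is scored with a cross-entropy against the target style label; this classifier form, the loss weighting, and all names are assumptions, since the embodiments fix only that a third style discrimination signal is added.

```python
import torch
import torch.nn.functional as F

def second_network_g_step(gen, style_disc, label_disc, x, label, g_opt):
    fake = gen(x, label)  # style conversion with a target style label as extra input
    adv_loss = F.softplus(-style_disc(fake)).mean()        # from the second style discrimination information
    label_loss = F.cross_entropy(label_disc(fake), label)  # from the third style discrimination information
    loss = adv_loss + label_loss
    g_opt.zero_grad()
    loss.backward()
    g_opt.step()
    return loss
```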
As can be seen from the technical solutions provided by the above embodiments of this specification, in the process of generating a stylized image (a preset object style image) of a certain type of object, this specification decouples the stylized image into two parts, an object code and a style code, and, in the style fusion network, performs style fusion processing on the target style code and the preset object code using network fusion parameters determined from the target fusion weight and the fusion data corresponding to the preset number of network layers. By fusing the two, a target style fusion code is obtained that both represents the object features of a certain type of object and effectively incorporates the target style. Since the target fusion weight is learned from the target style code and the preset object code, the fusion weight of an object can be adjusted adaptively for different target styles, so that the object style code under the target style is fused better. On the basis of greatly improving the stylization effect and the quality of stylized images, the efficiency of adaptively generating multi-style object style images can be greatly improved.
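The following sketch illustrates this fusion mechanism under stated assumptions: the fusion weight is learned from the concatenated codes, the fusion data gates which layers participate, and their product gives the per-layer network fusion parameters. The layer count, code width, sigmoid weight head, and the convention that fusion starts at layer `t` are all assumptions, not the embodiments' fixed design.

```python
import torch
import torch.nn as nn

class StyleFusion(nn.Module):
    """Minimal sketch of weighted per-layer style fusion."""
    def __init__(self, dim=512, num_layers=14):
        super().__init__()
        self.num_layers = num_layers
        # fusion weight learning head: one learned weight per network layer
        self.weight_head = nn.Sequential(
            nn.Linear(2 * dim, dim),
            nn.ReLU(),
            nn.Linear(dim, num_layers),
            nn.Sigmoid(),
        )

    def forward(self, style_code, object_code, t):
        # target fusion weight learned from the concatenated codes
        w = self.weight_head(torch.cat([style_code, object_code], dim=-1))
        # fusion data: compare each layer index with the fusion control data t
        layer_idx = torch.arange(self.num_layers, device=w.device)
        fusion_data = (layer_idx >= t).float()
        alpha = (fusion_data * w).unsqueeze(-1)  # network fusion parameters, (batch, layers, 1)
        # per-layer style fusion: one fused code per network layer
        return alpha * style_code.unsqueeze(1) + (1 - alpha) * object_code.unsqueeze(1)

fusion = StyleFusion()
target_style_code = torch.randn(2, 512)
preset_object_code = torch.randn(2, 512)
target_style_fusion_code = fusion(target_style_code, preset_object_code, t=4)
```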
FIG. 9 is a flowchart of another image generation method according to an exemplary embodiment. As shown in FIG. 9, the image generation method is used in an electronic device such as a terminal or a server, and includes the following steps.
In step S901, a first original object image of a first target object is acquired.
In step S903, the first original object image is input into a first style transfer network for style conversion processing to obtain a first target object style image corresponding to the first target object.
In some embodiments, the first original object image may be an object image of the first target object uploaded by a user through a terminal. Taking the first target object being a user's face as an example, the first original object image may be a real face image of that user.
In some embodiments, the first target object style image may be an object image of the first target object in the target style.
In some embodiments, the terminal may input the first original object image into the first style transfer network for style conversion processing to obtain the first target object style image corresponding to the first target object. The terminal may also send the first original object image to a server, which generates the first target object style image based on the first style transfer network and transmits it to the terminal.
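An illustrative usage sketch of steps S901 and S903 follows; it is not part of the embodiments. The TorchScript checkpoint path, the image file, and the 256x256 preprocessing are all hypothetical.

```python
import torch
from torchvision import transforms
from PIL import Image

# Hypothetical trained first style transfer network saved as TorchScript.
net = torch.jit.load("first_style_transfer_net.pt")
net.eval()

preprocess = transforms.Compose([transforms.Resize((256, 256)),
                                 transforms.ToTensor()])
x = preprocess(Image.open("face.png").convert("RGB")).unsqueeze(0)  # first original object image
with torch.no_grad():
    styled = net(x)  # first target object style image
```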
In the above embodiment, the first style transfer network, trained on preset object style images that both retain the object features of natural objects and effectively incorporate the style features of the target style, is used to perform style conversion on the original object image of the first target object. On the basis of effectively improving the stylization effect, this ensures consistency between the object features in the converted first target object style image and the object features of the first target object, thereby greatly improving the quality of the stylized image.
FIG. 10 is a flowchart of another image generation method according to an exemplary embodiment. As shown in FIG. 10, the image generation method is used in an electronic device such as a terminal or a server, and includes the following steps.
In step S1001, a second original object image and a target style label of a second target object are acquired.
In step S1003, the second original object image and the target style label are input into a second style transfer network for style conversion processing to obtain a second target object style image corresponding to the second target object.
In some embodiments, the second original object image may be an object image of the second target object uploaded by a user through a terminal. Taking the second target object being a user's face as an example, the second original object image may be a real face image of that user. The target style label may be identification information of a style selected by the user.
In some embodiments, the second target object style image may be an object image of the second target object in the style corresponding to the target style label.
In some embodiments, the terminal may input the second original object image and the target style label into the second style transfer network for style conversion processing to obtain the second target object style image corresponding to the second target object. The terminal may also send the second original object image and the target style label to a server, which generates the second target object style image based on the second style transfer network and transmits it to the terminal.
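Continuing the previous usage sketch (reusing `torch` and the preprocessed tensor `x`), steps S1001 and S1003 differ only in the extra target style label input; the checkpoint path and the integer label id are hypothetical.

```python
# Hypothetical trained second style transfer network saved as TorchScript.
net2 = torch.jit.load("second_style_transfer_net.pt")
net2.eval()

style_label = torch.tensor([2])  # e.g. index of an "oil painting" style in an assumed label set
with torch.no_grad():
    styled = net2(x, style_label)  # second target object style image matching the label
```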
In the above embodiment, the second style transfer network, trained on preset object style images that both retain the object features of natural objects and effectively incorporate the style features of multiple target styles, is used to perform style conversion of the style corresponding to the target style label on the original object image of the second target object. On the basis of effectively improving the stylization effect, this ensures consistency between the object features in the converted second target object style image and the object features of the second target object, thereby greatly improving the quality of the stylized image.
FIG. 11 is a block diagram of an image generation apparatus according to an exemplary embodiment. Referring to FIG. 11, the apparatus includes:
a code acquisition module 1110 configured to acquire a preset object code and a target style code of a target style;
a first style fusion processing module 1120 configured to perform style fusion processing on the target style code and the preset object code based on network fusion parameters corresponding to a preset number of network layers in a style fusion network, to obtain a target style fusion code, wherein the network fusion parameters are determined based on fusion data corresponding to the preset number of network layers and a target fusion weight, and the target fusion weight is obtained by performing fusion weight learning based on the target style code and the preset object code;
a first image generation processing module 1130 configured to input the target style fusion code into a target image generation network for image generation processing, to obtain a preset object style image corresponding to the target style.
In some embodiments, the first style fusion processing module 1120 includes:
a target fusion control data acquisition unit configured to acquire target fusion control data, the target fusion control data being used to control the fusion position of the target style code and the preset object code in the style fusion network;
a fusion data determination unit configured to determine the fusion data corresponding to the preset number of network layers according to the target fusion control data;
a first concatenation processing unit configured to perform concatenation processing on the target style code and the preset object code to obtain a target concatenated code;
a first fusion weight learning unit configured to perform fusion weight learning based on the target concatenated code to obtain the target fusion weight;
a first weighting processing unit configured to perform weighting processing on the fusion data and the target fusion weight to obtain the network fusion parameters;
a first style fusion processing unit configured to perform style fusion processing on the target style code and the preset object code in the preset number of network layers based on the network fusion parameters, to obtain the target style fusion code.
In some embodiments, the fusion data determination unit includes:
a comparison unit configured to compare the layer numbers of the preset number of network layers with the target fusion control data to obtain comparison results;
a fusion data determination subunit configured to determine the fusion data corresponding to the preset number of network layers according to the comparison results.
In some embodiments, the code acquisition module 1110 includes:
a reference style image acquisition unit configured to acquire a reference style image of the target style;
a style encoding processing unit configured to input the reference style image into a style encoding network for style encoding processing to obtain the target style code.
In some embodiments, the above apparatus further includes:
a sample image acquisition module configured to acquire a positive sample style image pair and a negative sample style image pair of the target style;
a style encoding processing module configured to input the positive sample style image pair and the negative sample style image pair into a style encoding network to be trained for style encoding processing, to obtain the sample style codes corresponding to the positive sample style image pair and the negative sample style image pair respectively;
a perception processing module configured to input the sample style codes into a perception network to be trained for perception processing, to obtain the sample perceptual feature information corresponding to the positive sample style image pair and the negative sample style image pair respectively;
a contrastive loss information determination module configured to determine contrastive loss information according to the sample perceptual feature information (an illustrative loss sketch follows this list);
a first network training module configured to train the style encoding network to be trained and the perception network to be trained based on the contrastive loss information;
a style encoding network determination module configured to use the trained style encoding network to be trained as the style encoding network.
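A minimal sketch of one possible contrastive loss over the sample perceptual feature information: the positive pair is pulled together and the negative pair pushed apart. The margin form is an assumption, since the embodiments fix only that a contrastive loss is determined.

```python
import torch.nn.functional as F

def contrastive_loss(feat_pos_a, feat_pos_b, feat_neg_a, feat_neg_b, margin=1.0):
    # distances between the perceptual features of each image pair
    pos_dist = F.pairwise_distance(feat_pos_a, feat_pos_b)
    neg_dist = F.pairwise_distance(feat_neg_a, feat_neg_b)
    # positive pairs should be close; negative pairs at least `margin` apart
    return (pos_dist.pow(2) + F.relu(margin - neg_dist).pow(2)).mean()
```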
In some embodiments, the code acquisition module 1110 includes:
an initial style code generation unit configured to randomly generate an initial style code based on a first preset distribution;
a first perception processing unit configured to input the initial style code into a first multi-layer perception network for perception processing to obtain the target style code.
In some embodiments, the code acquisition module 1110 includes:
an initial object code generation unit configured to randomly generate an initial object code based on a second preset distribution;
a second perception processing unit configured to input the initial object code into a second multi-layer perception network for perception processing to obtain the preset object code (see the illustrative sketch below).
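A minimal sketch of these two code acquisition paths, assuming a standard normal as both preset distributions and an 8-layer MLP as each multi-layer perception network; neither choice is fixed by the embodiments.

```python
import torch
import torch.nn as nn

def make_mlp(dim=512, depth=8):
    layers = []
    for _ in range(depth):
        layers += [nn.Linear(dim, dim), nn.LeakyReLU(0.2)]
    return nn.Sequential(*layers)

style_mlp, object_mlp = make_mlp(), make_mlp()  # first / second multi-layer perception networks
z_style = torch.randn(1, 512)    # initial style code from the first preset distribution
z_object = torch.randn(1, 512)   # initial object code from the second preset distribution
target_style_code = style_mlp(z_style)
preset_object_code = object_mlp(z_object)
```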
In some embodiments, the above apparatus further includes:
a sample data acquisition module configured to acquire a first sample style code of the target style, a sample object code, a second sample style code of a non-target style, a preset style object image and a preset object image;
a second style fusion processing module configured to perform style fusion processing on the first sample style code and the sample object code based on sample network fusion parameters corresponding to a preset number of network layers to be trained in a style fusion network to be trained, to obtain a sample style fusion code;
a second image generation processing module configured to input the sample style fusion code into an image generation network to be trained for image generation processing, to obtain a sample object style image corresponding to the target style;
a style discrimination processing module configured to input the sample object style image, the preset style object image, the preset object image, the first sample style code and the second sample style code into a discrimination network to be trained for style discrimination processing, to obtain target discrimination information;
a target loss information determination module configured to determine target loss information according to the target discrimination information;
a second network training module configured to train the style fusion network to be trained, the image generation network to be trained and the discrimination network to be trained based on the target loss information;
a network determination module configured to use the trained style fusion network to be trained as the style fusion network, and use the trained image generation network to be trained as the target image generation network.
In some embodiments, the sample style fusion code includes a first style fusion code and a second style fusion code; the second style fusion processing module includes:
a sample fusion control data acquisition unit configured to acquire first fusion control data and second fusion control data, the first fusion control data being used to control the first sample style code and the sample object code to be fused starting from the first network layer in the style fusion network to be trained, and the second fusion control data being used to control the first sample style code not to participate in fusion in the style fusion network to be trained;
a sample fusion data determination unit configured to determine, according to the first fusion control data and the second fusion control data respectively, first sample fusion data and second sample fusion data corresponding to the preset number of network layers to be trained;
a second concatenation processing unit configured to perform concatenation processing on the first sample style code and the sample object code to obtain a sample concatenated code;
a second fusion weight learning unit configured to perform fusion weight learning based on the sample concatenated code to obtain a sample fusion weight;
a second weighting processing unit configured to perform weighting processing on the first sample fusion data and the second sample fusion data respectively with the sample fusion weight, to obtain first sample network fusion parameters and second sample network fusion parameters;
a second style fusion processing unit configured to perform style fusion processing on the first sample style code and the sample object code in the preset number of network layers to be trained based on the first sample network fusion parameters and the second sample network fusion parameters respectively, to obtain the first style fusion code and the second style fusion code (see the illustrative sketch below).
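Continuing the `StyleFusion` sketch given earlier, the two fusion control settings can be illustrated as follows. Encoding the first fusion control data as `t = 0` (fusion from the first layer) and the second as `t` equal to the layer count (no fusion, so the style code does not participate) is an assumption about how the control data is represented.

```python
fusion = StyleFusion()
sample_style_code = torch.randn(1, 512)   # first sample style code (placeholder)
sample_object_code = torch.randn(1, 512)  # sample object code (placeholder)
first_style_fusion_code = fusion(sample_style_code, sample_object_code, t=0)
second_style_fusion_code = fusion(sample_style_code, sample_object_code, t=fusion.num_layers)
```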
In some embodiments, the sample object style image includes a first sample object style image corresponding to the first style fusion code and a second sample object style image corresponding to the second style fusion code; the discrimination network to be trained includes an object discrimination network, a style object discrimination network and a style code discrimination network; and the target discrimination information includes object discrimination information, style object discrimination information and style code discrimination information.
The style discrimination processing module includes:
an object discrimination processing unit configured to input the second sample object style image and the preset object image into the object discrimination network for object discrimination processing to obtain the object discrimination information;
a style object discrimination processing unit configured to input the first sample object style image and the preset style object image into the style object discrimination network for style object discrimination processing to obtain the style object discrimination information;
a style code discrimination processing unit configured to input the second sample object style image, the first sample style code and the second sample style code into the style code discrimination network for style code discrimination processing to obtain the style code discrimination information.
FIG. 12 is a block diagram of another image generation apparatus according to an exemplary embodiment. Referring to FIG. 12, the apparatus includes:
an original object image acquisition module 1210 configured to acquire a first original object image of a first target object;
a first style conversion processing module 1220 configured to input the first original object image into a first style transfer network for style conversion processing, to obtain a first target object style image corresponding to the first target object;
wherein the first style transfer network is obtained by performing adversarial training on a first preset image generation network based on first sample object images and preset object style images of the target style generated by the above image generation method.
FIG. 13 is a block diagram of another image generation apparatus according to an exemplary embodiment. Referring to FIG. 13, the apparatus includes:
a data acquisition module 1310 configured to acquire a second original object image and a target style label of a second target object;
a second style conversion processing module 1320 configured to input the second original object image and the target style label into a second style transfer network for style conversion processing, to obtain a second target object style image corresponding to the second target object;
wherein the second style transfer network is obtained by performing adversarial training on a second preset target image generation network based on second sample object images, multiple target style labels and preset object style images of multiple target styles generated by the above image generation method.
With regard to the apparatuses in the above embodiments, the specific manner in which each module performs operations has been described in detail in the embodiments of the related methods, and will not be elaborated here.
FIG. 14 is a block diagram of an electronic device for image generation according to an exemplary embodiment. The electronic device may be a terminal, and its internal structure may be as shown in FIG. 14. The electronic device includes a processor, a memory, a network interface, a display screen and an input apparatus connected through a system bus. The processor of the electronic device is used to provide computing and control capabilities. The memory of the electronic device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of the operating system and the computer program in the non-volatile storage medium. The network interface of the electronic device is used to communicate with an external terminal through a network connection. The computer program, when executed by the processor, implements an image generation method. The display screen of the electronic device may be a liquid crystal display screen or an electronic ink display screen, and the input apparatus of the electronic device may be a touch layer covering the display screen, a button, a trackball or a touchpad provided on the housing of the electronic device, or an external keyboard, touchpad or mouse.
FIG. 15 is a block diagram of another electronic device for image generation according to an exemplary embodiment. The electronic device may be a server, and its internal structure may be as shown in FIG. 15. The electronic device includes a processor, a memory and a network interface connected through a system bus. The processor of the electronic device is used to provide computing and control capabilities. The memory of the electronic device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of the operating system and the computer program in the non-volatile storage medium. The network interface of the electronic device is used to communicate with an external terminal through a network connection. The computer program, when executed by the processor, implements an image generation method.
Those skilled in the art can understand that the structure shown in FIG. 14 or FIG. 15 is only a block diagram of part of the structure related to the solution of the present disclosure, and does not limit the electronic device to which the solution of the present disclosure is applied. In some embodiments, the electronic device may include more or fewer components than shown in the figures, or combine certain components, or have a different arrangement of components.
In an exemplary embodiment, an electronic device is also provided, including: a processor; and a memory for storing instructions executable by the processor; wherein the processor is configured to execute the instructions to implement the image generation method in the embodiments of the present disclosure.
In an exemplary embodiment, a computer-readable storage medium is also provided. When the instructions in the storage medium are executed by a processor of an electronic device, the electronic device is enabled to execute the image generation method in the embodiments of the present disclosure.
In an exemplary embodiment, a computer program product containing instructions is also provided, which, when run on a computer, causes the computer to execute the image generation method in the embodiments of the present disclosure.
Those of ordinary skill in the art can understand that all or part of the processes in the methods of the above embodiments can be implemented by instructing related hardware through a computer program. The computer program can be stored in a non-volatile computer-readable storage medium, and when executed, may include the processes of the embodiments of the above methods. Any reference to memory, storage, a database or other media used in the embodiments provided in the present application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM) or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM) and Rambus dynamic RAM (RDRAM).
All embodiments of the present disclosure can be implemented independently or in combination with other embodiments, all of which fall within the scope of protection claimed by the present disclosure.

Claims (38)

1. An image generation method, comprising:
    acquiring a preset object code and a target style code of a target style;
    performing style fusion processing on the target style code and the preset object code based on network fusion parameters corresponding to a preset number of network layers in a style fusion network, to obtain a target style fusion code, wherein the network fusion parameters are determined based on fusion data corresponding to the preset number of network layers and a target fusion weight, and the target fusion weight is obtained by performing fusion weight learning based on the target style code and the preset object code; and
    inputting the target style fusion code into a target image generation network for image generation processing, to obtain a preset object style image corresponding to the target style.
2. The image generation method according to claim 1, wherein the performing style fusion processing on the target style code and the preset object code based on the network fusion parameters corresponding to the preset number of network layers in the style fusion network, to obtain the target style fusion code comprises:
    acquiring target fusion control data, the target fusion control data being used to control the fusion position of the target style code and the preset object code in the style fusion network;
    determining the fusion data corresponding to the preset number of network layers according to the target fusion control data;
    performing concatenation processing on the target style code and the preset object code to obtain a target concatenated code;
    performing fusion weight learning based on the target concatenated code to obtain the target fusion weight;
    performing weighting processing on the fusion data and the target fusion weight to obtain the network fusion parameters; and
    performing style fusion processing on the target style code and the preset object code in the preset number of network layers based on the network fusion parameters, to obtain the target style fusion code.
3. The image generation method according to claim 2, wherein the determining the fusion data corresponding to the preset number of network layers according to the target fusion control data comprises:
    comparing the layer numbers of the preset number of network layers with the target fusion control data to obtain comparison results; and
    determining the fusion data corresponding to the preset number of network layers according to the comparison results.
4. The image generation method according to any one of claims 1 to 3, wherein the acquiring the target style code of the target style comprises:
    acquiring a reference style image of the target style; and
    inputting the reference style image into a style encoding network for style encoding processing to obtain the target style code.
5. The image generation method according to claim 4, further comprising:
    acquiring a positive sample style image pair and a negative sample style image pair of the target style;
    inputting the positive sample style image pair and the negative sample style image pair into a style encoding network to be trained for style encoding processing, to obtain the sample style codes corresponding to the positive sample style image pair and the negative sample style image pair respectively;
    inputting the sample style codes into a perception network to be trained for perception processing, to obtain the sample perceptual feature information corresponding to the positive sample style image pair and the negative sample style image pair respectively;
    determining contrastive loss information according to the sample perceptual feature information;
    training the style encoding network to be trained and the perception network to be trained based on the contrastive loss information; and
    using the trained style encoding network to be trained as the style encoding network.
6. The image generation method according to any one of claims 1 to 3, wherein the acquiring the target style code of the target style comprises:
    randomly generating an initial style code based on a first preset distribution; and
    inputting the initial style code into a first multi-layer perception network for perception processing to obtain the target style code.
7. The image generation method according to any one of claims 1 to 3, wherein the preset object code is acquired by the following steps:
    randomly generating an initial object code based on a second preset distribution; and
    inputting the initial object code into a second multi-layer perception network for perception processing to obtain the preset object code.
8. The image generation method according to any one of claims 1 to 3, further comprising:
    acquiring a first sample style code of the target style, a sample object code, a second sample style code of a non-target style, a preset style object image and a preset object image;
    performing style fusion processing on the first sample style code and the sample object code based on sample network fusion parameters corresponding to a preset number of network layers to be trained in a style fusion network to be trained, to obtain a sample style fusion code;
    inputting the sample style fusion code into an image generation network to be trained for image generation processing, to obtain a sample object style image corresponding to the target style;
    inputting the sample object style image, the preset style object image, the preset object image, the first sample style code and the second sample style code into a discrimination network to be trained for style discrimination processing, to obtain target discrimination information;
    determining target loss information according to the target discrimination information;
    training the style fusion network to be trained, the image generation network to be trained and the discrimination network to be trained based on the target loss information; and
    using the trained style fusion network to be trained as the style fusion network, and using the trained image generation network to be trained as the target image generation network.
9. The image generation method according to claim 8, wherein the sample style fusion code comprises a first style fusion code and a second style fusion code, and the performing style fusion processing on the first sample style code and the sample object code based on the sample network fusion parameters corresponding to the preset number of network layers to be trained in the style fusion network to be trained, to obtain the sample style fusion code comprises:
    acquiring first fusion control data and second fusion control data, the first fusion control data being used to control the first sample style code and the sample object code to be fused starting from the first network layer in the style fusion network to be trained, and the second fusion control data being used to control the first sample style code not to participate in fusion in the style fusion network to be trained;
    determining, according to the first fusion control data and the second fusion control data respectively, first sample fusion data and second sample fusion data corresponding to the preset number of network layers to be trained;
    performing concatenation processing on the first sample style code and the sample object code to obtain a sample concatenated code;
    performing fusion weight learning based on the sample concatenated code to obtain a sample fusion weight;
    performing weighting processing on the first sample fusion data and the second sample fusion data respectively with the sample fusion weight, to obtain first sample network fusion parameters and second sample network fusion parameters; and
    performing style fusion processing on the first sample style code and the sample object code in the preset number of network layers to be trained based on the first sample network fusion parameters and the second sample network fusion parameters respectively, to obtain the first style fusion code and the second style fusion code.
10. The image generation method according to claim 9, wherein the sample object style image comprises a first sample object style image corresponding to the first style fusion code and a second sample object style image corresponding to the second style fusion code; the discrimination network to be trained comprises an object discrimination network, a style object discrimination network and a style code discrimination network; the target discrimination information comprises object discrimination information, style object discrimination information and style code discrimination information; and
    the inputting the sample object style image, the preset style object image, the preset object image, the first sample style code and the second sample style code into the discrimination network to be trained for style discrimination processing, to obtain the target discrimination information comprises:
    inputting the second sample object style image and the preset object image into the object discrimination network for object discrimination processing to obtain the object discrimination information;
    inputting the first sample object style image and the preset style object image into the style object discrimination network for style object discrimination processing to obtain the style object discrimination information; and
    inputting the second sample object style image, the first sample style code and the second sample style code into the style code discrimination network for style code discrimination processing to obtain the style code discrimination information.
11. An image generation method, comprising:
    acquiring a first original object image of a first target object; and
    inputting the first original object image into a first style transfer network for style conversion processing, to obtain a first target object style image corresponding to the first target object;
    wherein the first style transfer network is obtained by performing adversarial training on a first preset image generation network based on first sample object images and preset object style images of the target style generated by the image generation method according to any one of claims 1 to 10.
12. An image generation method, comprising:
    acquiring a second original object image and a target style label of a second target object;
    inputting the second original object image and the target style label into a second style transfer network for style conversion processing, to obtain a second target object style image corresponding to the second target object;
    wherein the second style transfer network is obtained by performing adversarial training on a second preset target image generation network based on second sample object images, multiple target style labels and preset object style images of multiple target styles generated by the image generation method according to any one of claims 1 to 10.
13. An image generation apparatus, comprising:
    a code acquisition module configured to acquire a preset object code and a target style code of a target style;
    a first style fusion processing module configured to perform style fusion processing on the target style code and the preset object code based on network fusion parameters corresponding to a preset number of network layers in a style fusion network, to obtain a target style fusion code, wherein the network fusion parameters are determined based on fusion data corresponding to the preset number of network layers and a target fusion weight, and the target fusion weight is obtained by performing fusion weight learning based on the target style code and the preset object code; and
    a first image generation processing module configured to input the target style fusion code into a target image generation network for image generation processing, to obtain a preset object style image corresponding to the target style.
14. The image generation apparatus according to claim 13, wherein the first style fusion processing module comprises:
    a target fusion control data acquisition unit configured to acquire target fusion control data, the target fusion control data being used to control the fusion position of the target style code and the preset object code in the style fusion network;
    a fusion data determination unit configured to determine the fusion data corresponding to the preset number of network layers according to the target fusion control data;
    a first concatenation processing unit configured to perform concatenation processing on the target style code and the preset object code to obtain a target concatenated code;
    a first fusion weight learning unit configured to perform fusion weight learning based on the target concatenated code to obtain the target fusion weight;
    a first weighting processing unit configured to perform weighting processing on the fusion data and the target fusion weight to obtain the network fusion parameters; and
    a first style fusion processing unit configured to perform style fusion processing on the target style code and the preset object code in the preset number of network layers based on the network fusion parameters, to obtain the target style fusion code.
15. The image generation apparatus according to claim 14, wherein the fusion data determination unit comprises:
    a comparison unit configured to compare the layer numbers of the preset number of network layers with the target fusion control data to obtain comparison results; and
    a fusion data determination subunit configured to determine the fusion data corresponding to the preset number of network layers according to the comparison results.
16. The image generation apparatus according to any one of claims 13 to 15, wherein the code acquisition module comprises:
    a reference style image acquisition unit configured to acquire a reference style image of the target style; and
    a style encoding processing unit configured to input the reference style image into a style encoding network for style encoding processing to obtain the target style code.
17. The image generation apparatus according to claim 16, further comprising:
    a sample image acquisition module configured to acquire a positive sample style image pair and a negative sample style image pair of the target style;
    a style encoding processing module configured to input the positive sample style image pair and the negative sample style image pair into a style encoding network to be trained for style encoding processing, to obtain the sample style codes corresponding to the positive sample style image pair and the negative sample style image pair respectively;
    a perception processing module configured to input the sample style codes into a perception network to be trained for perception processing, to obtain the sample perceptual feature information corresponding to the positive sample style image pair and the negative sample style image pair respectively;
    a contrastive loss information determination module configured to determine contrastive loss information according to the sample perceptual feature information;
    a first network training module configured to train the style encoding network to be trained and the perception network to be trained based on the contrastive loss information; and
    a style encoding network determination module configured to use the trained style encoding network to be trained as the style encoding network.
18. The image generation apparatus according to any one of claims 13 to 15, wherein the code acquisition module comprises:
    an initial style code generation unit configured to randomly generate an initial style code based on a first preset distribution; and
    a first perception processing unit configured to input the initial style code into a first multi-layer perception network for perception processing to obtain the target style code.
19. The image generation apparatus according to any one of claims 13 to 15, wherein the code acquisition module comprises:
    an initial object code generation unit configured to randomly generate an initial object code based on a second preset distribution; and
    a second perception processing unit configured to input the initial object code into a second multi-layer perception network for perception processing to obtain the preset object code.
  20. The image generation apparatus according to any one of claims 13 to 15, wherein the apparatus further comprises:
    a sample data acquisition module configured to acquire a first sample style code of the target style, a sample object code, a second sample style code of a non-target style, a preset style object image, and a preset object image;
    a second style fusion processing module configured to perform style fusion processing on the first sample style code and the sample object code based on sample network fusion parameters corresponding to a preset number of network layers to be trained in a style fusion network to be trained, to obtain a sample style fusion code;
    a second image generation processing module configured to input the sample style fusion code into an image generation network to be trained for image generation processing, to obtain a sample object style image corresponding to the target style;
    a style discrimination processing module configured to input the sample object style image, the preset style object image, the preset object image, the first sample style code, and the second sample style code into a discrimination network to be trained for style discrimination processing, to obtain target discrimination information;
    a target loss information determination module configured to determine target loss information according to the target discrimination information;
    a second network training module configured to train, based on the target loss information, the style fusion network to be trained, the image generation network to be trained, and the discrimination network to be trained;
    a network determination module configured to use the trained style fusion network as the style fusion network, and to use the trained image generation network as the target image generation network.
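    A sketch of one step of the adversarial training in claim 20. The non-saturating GAN loss, the optimizer split, and every name here are assumptions; the claim only requires that target loss information drives the training of the three networks.

        import torch.nn.functional as F

        def train_step(fusion_net, generator, discriminator, g_opt, d_opt,
                       style_code, object_code, real_style_img):
            # g_opt is assumed to hold the fusion network's and generator's
            # parameters; d_opt holds the discrimination network's.
            fake = generator(fusion_net(style_code, object_code))

            # Critic step: real preset style object images vs. generated ones.
            d_loss = (F.softplus(-discriminator(real_style_img)).mean()
                      + F.softplus(discriminator(fake.detach())).mean())
            d_opt.zero_grad(); d_loss.backward(); d_opt.step()

            # Generator step: fool the (just updated) critic.
            g_loss = F.softplus(-discriminator(fake)).mean()
            g_opt.zero_grad(); g_loss.backward(); g_opt.step()
            return d_loss.item(), g_loss.item()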
  21. The image generation apparatus according to claim 20, wherein the sample style fusion code comprises a first style fusion code and a second style fusion code, and the second style fusion processing module comprises:
    a sample fusion control data acquisition unit configured to acquire first fusion control data and second fusion control data, wherein the first fusion control data is used to control the first sample style code and the sample object code to be fused starting from the first network layer in the style fusion network to be trained, and the second fusion control data is used to control the first sample style code not to participate in fusion in the style fusion network to be trained;
    a sample fusion data determination unit configured to determine, according to the first fusion control data and the second fusion control data, first sample fusion data and second sample fusion data corresponding to the preset number of network layers to be trained, respectively;
    a second concatenation processing unit configured to concatenate the first sample style code and the sample object code to obtain a sample concatenated code;
    a second fusion weight learning unit configured to perform fusion weight learning based on the sample concatenated code to obtain a sample fusion weight;
    a second weighting processing unit configured to weight the first sample fusion data and the second sample fusion data respectively with the sample fusion weight, to obtain first sample network fusion parameters and second sample network fusion parameters;
    a second style fusion processing unit configured to perform, based on the first sample network fusion parameters and the second sample network fusion parameters respectively, style fusion processing on the first sample style code and the sample object code in the preset number of network layers to be trained, to obtain the first style fusion code and the second style fusion code.
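    The two branches of claim 21 can be pictured as one fusion module evaluated under two masks: all-ones fusion data (first fusion control data, fuse from the first layer) and all-zeros fusion data (second fusion control data, style excluded). The sketch below is one plausible reading under assumed shapes, with a learned convex per-layer mix standing in for "style fusion processing"; it is not the definitive implementation.

        import torch
        import torch.nn as nn

        class StyleFusionSketch(nn.Module):
            def __init__(self, dim=512, num_layers=14):
                super().__init__()
                # Fusion weight learning: an MLP over the concatenated codes
                # emits one weight per network layer.
                self.weight_mlp = nn.Sequential(
                    nn.Linear(2 * dim, 256), nn.ReLU(),
                    nn.Linear(256, num_layers), nn.Sigmoid())

            def forward(self, style_code, object_code, fusion_data):
                # fusion_data: (num_layers,) 0/1 mask from the fusion control data.
                weight = self.weight_mlp(torch.cat([style_code, object_code], -1))
                alpha = (weight * fusion_data).unsqueeze(-1)  # network fusion parameters
                # Per-layer fused code: weighted mix of style and object codes.
                return (alpha * style_code.unsqueeze(1)
                        + (1 - alpha) * object_code.unsqueeze(1))

        fuse = StyleFusionSketch()
        s, o = torch.randn(2, 512), torch.randn(2, 512)
        first_code = fuse(s, o, torch.ones(14))    # fused in every layer
        second_code = fuse(s, o, torch.zeros(14))  # style does not participate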
  22. The image generation apparatus according to claim 21, wherein the sample object style image comprises a first sample object style image corresponding to the first style fusion code and a second sample object style image corresponding to the second style fusion code; the discrimination network to be trained comprises an object discrimination network, a style object discrimination network, and a style code discrimination network; the target discrimination information comprises object discrimination information, style object discrimination information, and style code discrimination information;
    and the style discrimination processing module comprises:
    an object discrimination processing unit configured to input the second sample object style image and the preset object image into the object discrimination network for object discrimination processing, to obtain the object discrimination information;
    a style object discrimination processing unit configured to input the first sample object style image and the preset style object image into the style object discrimination network for style object discrimination processing, to obtain the style object discrimination information;
    a style code discrimination processing unit configured to input the second sample object style image, the first sample style code, and the second sample style code into the style code discrimination network for style code discrimination processing, to obtain the style code discrimination information.
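    A hypothetical routing of claim 22's three critics. All callables and argument names are assumptions, as is the conditional form of the style code discriminator:

        def discrimination_step(first_img, second_img, style_obj_img, object_img,
                                pos_style_code, neg_style_code,
                                object_d, style_object_d, style_code_d):
            # Object realism: image from the style-free branch vs. a real
            # preset object image.
            object_info = (object_d(second_img), object_d(object_img))
            # Styled-object realism: image from the fused branch vs. a real
            # preset style object image.
            style_object_info = (style_object_d(first_img),
                                 style_object_d(style_obj_img))
            # Style consistency: the same image scored against the target-style
            # code and against a non-target-style code.
            style_code_info = (style_code_d(second_img, pos_style_code),
                               style_code_d(second_img, neg_style_code))
            return object_info, style_object_info, style_code_info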
  23. An image generation apparatus, comprising:
    an original object image acquisition module configured to acquire a first original object image of a first target object;
    a first style conversion processing module configured to input the first original object image into a first style conversion network for style conversion processing, to obtain a first target object style image corresponding to the first target object;
    wherein the first style conversion network is obtained by performing adversarial training on a first preset image generation network based on a first sample object image and a preset object style image of the target style generated by the image generation method according to any one of claims 1 to 10.
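    In effect, claim 23 distills the generation pipeline into a fast feed-forward conversion network: images synthesized offline by the claimed method form the "real" domain of the adversarial training. A sketch of that offline synthesis, with every argument name hypothetical:

        import torch

        @torch.no_grad()
        def build_style_training_set(style_code, object_sampler, fusion_net,
                                     target_generator, n=1000):
            # Produce n preset object style images with the claims 1-10 pipeline;
            # the first style conversion network is then trained adversarially
            # against them (claim 24 repeats this across styles, adding a label).
            images = []
            for _ in range(n):
                fused = fusion_net(style_code, object_sampler())
                images.append(target_generator(fused))
            return torch.cat(images)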
  24. An image generation apparatus, comprising:
    a data acquisition module configured to acquire a second original object image of a second target object and a target style label;
    a second style conversion processing module configured to input the second original object image and the target style label into a second style conversion network for style conversion processing, to obtain a second target object style image corresponding to the second target object;
    wherein the second style conversion network is obtained by performing adversarial training on a second preset target image generation network based on a second sample object image, multiple target style labels, and preset object style images of multiple target styles generated by the image generation method according to any one of claims 1 to 10.
  25. An electronic device, comprising:
    a processor; and
    a memory for storing instructions executable by the processor;
    wherein the processor is configured to execute the instructions to perform the following steps:
    acquiring a preset object code and a target style code of a target style;
    performing style fusion processing on the target style code and the preset object code based on network fusion parameters corresponding to a preset number of network layers in a style fusion network, to obtain a target style fusion code, wherein the network fusion parameters are determined based on fusion data corresponding to the preset number of network layers and a target fusion weight, and the target fusion weight is obtained by performing fusion weight learning based on the target style code and the preset object code; and
    inputting the target style fusion code into a target image generation network for image generation processing, to obtain a preset object style image corresponding to the target style.
  26. The electronic device according to claim 25, wherein the processor is further configured to perform the following steps:
    acquiring target fusion control data, wherein the target fusion control data is used to control a fusion position of the target style code and the preset object code in the style fusion network;
    determining the fusion data corresponding to the preset number of network layers according to the target fusion control data;
    concatenating the target style code and the preset object code to obtain a target concatenated code;
    performing fusion weight learning based on the target concatenated code to obtain the target fusion weight;
    weighting the fusion data with the target fusion weight to obtain the network fusion parameters; and
    performing style fusion processing on the target style code and the preset object code in the preset number of network layers based on the network fusion parameters, to obtain the target style fusion code.
  27. The electronic device according to claim 26, wherein the processor is further configured to perform the following steps:
    comparing the layer indices of the preset number of network layers with the target fusion control data to obtain a comparison result; and
    determining the fusion data corresponding to the preset number of network layers according to the comparison result.
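    One plausible reading of claim 27's comparison (the claim fixes neither the comparison's direction nor the encoding of the control data): threshold the layer index against the control value, so the control value directly selects the fusion position.

        import torch

        def fusion_data_from_control(num_layers, fusion_control):
            # Layers at or after the control index get 1 (style is fused
            # there); earlier layers get 0.
            layer_idx = torch.arange(num_layers)
            return (layer_idx >= fusion_control).float()

        fusion_data_from_control(6, 0)  # all ones  -> fuse from the first layer
        fusion_data_from_control(6, 6)  # all zeros -> style not fused anywhere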
  28. The electronic device according to any one of claims 25 to 27, wherein the processor is further configured to perform the following steps:
    acquiring a reference style image of the target style; and
    inputting the reference style image into a style encoding network for style encoding processing to obtain the target style code.
  29. The electronic device according to claim 28, wherein the processor is further configured to perform the following steps:
    acquiring a positive sample style image pair and a negative sample style image pair of the target style;
    inputting the positive sample style image pair and the negative sample style image pair into a style encoding network to be trained for style encoding processing, to obtain the sample style codes corresponding to the positive sample style image pair and the negative sample style image pair respectively;
    inputting the sample style codes into a perceptual network to be trained for perceptual processing, to obtain the sample perceptual feature information corresponding to the positive sample style image pair and the negative sample style image pair respectively;
    determining contrastive loss information according to the sample perceptual feature information;
    training the style encoding network to be trained and the perceptual network to be trained based on the contrastive loss information; and
    using the trained style encoding network as the style encoding network.
  30. The electronic device according to any one of claims 25 to 27, wherein the processor is further configured to perform the following steps:
    randomly generating an initial style code based on a first preset distribution; and
    inputting the initial style code into a first multi-layer perceptual network for perceptual processing to obtain the target style code.
  31. The electronic device according to any one of claims 25 to 27, wherein the processor is further configured to perform the following steps:
    randomly generating an initial object code based on a second preset distribution; and
    inputting the initial object code into a second multi-layer perceptual network for perceptual processing to obtain the preset object code.
  32. The electronic device according to any one of claims 25 to 27, wherein the processor is further configured to perform the following steps:
    acquiring a first sample style code of the target style, a sample object code, a second sample style code of a non-target style, a preset style object image, and a preset object image;
    performing style fusion processing on the first sample style code and the sample object code based on sample network fusion parameters corresponding to a preset number of network layers to be trained in a style fusion network to be trained, to obtain a sample style fusion code;
    inputting the sample style fusion code into an image generation network to be trained for image generation processing, to obtain a sample object style image corresponding to the target style;
    inputting the sample object style image, the preset style object image, the preset object image, the first sample style code, and the second sample style code into a discrimination network to be trained for style discrimination processing, to obtain target discrimination information;
    determining target loss information according to the target discrimination information;
    training, based on the target loss information, the style fusion network to be trained, the image generation network to be trained, and the discrimination network to be trained; and
    using the trained style fusion network as the style fusion network, and using the trained image generation network as the target image generation network.
  33. The electronic device according to claim 32, wherein the sample style fusion code comprises a first style fusion code and a second style fusion code, and the processor is further configured to perform the following steps:
    acquiring first fusion control data and second fusion control data, wherein the first fusion control data is used to control the first sample style code and the sample object code to be fused starting from the first network layer in the style fusion network to be trained, and the second fusion control data is used to control the first sample style code not to participate in fusion in the style fusion network to be trained;
    determining, according to the first fusion control data and the second fusion control data, first sample fusion data and second sample fusion data corresponding to the preset number of network layers to be trained, respectively;
    concatenating the first sample style code and the sample object code to obtain a sample concatenated code;
    performing fusion weight learning based on the sample concatenated code to obtain a sample fusion weight;
    weighting the first sample fusion data and the second sample fusion data respectively with the sample fusion weight, to obtain first sample network fusion parameters and second sample network fusion parameters; and
    performing, based on the first sample network fusion parameters and the second sample network fusion parameters respectively, style fusion processing on the first sample style code and the sample object code in the preset number of network layers to be trained, to obtain the first style fusion code and the second style fusion code.
  34. The electronic device according to claim 33, wherein the sample object style image comprises a first sample object style image corresponding to the first style fusion code and a second sample object style image corresponding to the second style fusion code; the discrimination network to be trained comprises an object discrimination network, a style object discrimination network, and a style code discrimination network; the target discrimination information comprises object discrimination information, style object discrimination information, and style code discrimination information;
    and the processor is further configured to perform the following steps:
    inputting the second sample object style image and the preset object image into the object discrimination network for object discrimination processing to obtain the object discrimination information;
    inputting the first sample object style image and the preset style object image into the style object discrimination network for style object discrimination processing to obtain the style object discrimination information; and
    inputting the second sample object style image, the first sample style code, and the second sample style code into the style code discrimination network for style code discrimination processing to obtain the style code discrimination information.
  35. An electronic device, comprising:
    a processor; and
    a memory for storing instructions executable by the processor;
    wherein the processor is configured to execute the instructions to perform the following steps:
    acquiring a first original object image of a first target object; and
    inputting the first original object image into a first style conversion network for style conversion processing, to obtain a first target object style image corresponding to the first target object;
    wherein the first style conversion network is obtained by performing adversarial training on a first preset image generation network based on a first sample object image and a preset object style image of the target style generated by the image generation method according to any one of claims 1 to 10.
  36. An electronic device, comprising:
    a processor; and
    a memory for storing instructions executable by the processor;
    wherein the processor is configured to execute the instructions to perform the following steps:
    acquiring a second original object image of a second target object and a target style label; and
    inputting the second original object image and the target style label into a second style conversion network for style conversion processing, to obtain a second target object style image corresponding to the second target object;
    wherein the second style conversion network is obtained by performing adversarial training on a second preset target image generation network based on a second sample object image, multiple target style labels, and preset object style images of multiple target styles generated by the image generation method according to any one of claims 1 to 10.
  37. A computer-readable storage medium, wherein, in response to instructions in the storage medium being executed by a processor of an electronic device, the electronic device is enabled to perform the image generation method according to any one of claims 1 to 12.
  38. A computer program product comprising computer instructions, wherein, in response to the computer instructions being executed by a processor, the image generation method according to any one of claims 1 to 12 is implemented.
PCT/CN2022/094971 2021-11-18 2022-05-25 Image generation method and apparatus WO2023087656A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111371705.0 2021-11-18
CN202111371705.0A CN114202456A (en) 2021-11-18 2021-11-18 Image generation method, image generation device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
WO2023087656A1 true WO2023087656A1 (en) 2023-05-25

Family

ID=80648046

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/094971 WO2023087656A1 (en) 2021-11-18 2022-05-25 Image generation method and apparatus

Country Status (2)

Country Link
CN (1) CN114202456A (en)
WO (1) WO2023087656A1 (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114202456A (en) * 2021-11-18 2022-03-18 北京达佳互联信息技术有限公司 Image generation method, image generation device, electronic equipment and storage medium
CN114418919B (en) * 2022-03-25 2022-07-26 北京大甜绵白糖科技有限公司 Image fusion method and device, electronic equipment and storage medium
CN115063536B (en) * 2022-06-30 2023-10-10 中国电信股份有限公司 Image generation method, device, electronic equipment and computer readable storage medium
CN116152901B (en) * 2023-04-24 2023-08-01 广州趣丸网络科技有限公司 Training method of image generation model and stylized image generation method

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109472270A (en) * 2018-10-31 2019-03-15 京东方科技集团股份有限公司 Image style conversion method, device and equipment
CN111784565A (en) * 2020-07-01 2020-10-16 北京字节跳动网络技术有限公司 Image processing method, migration model training method, device, medium and equipment
KR20210028401A (en) * 2019-09-04 2021-03-12 주식회사 엔씨소프트 Device and method for style translation
CN112651890A (en) * 2020-12-18 2021-04-13 深圳先进技术研究院 PET-MRI image denoising method and device based on dual-coding fusion network model
CN114202456A (en) * 2021-11-18 2022-03-18 北京达佳互联信息技术有限公司 Image generation method, image generation device, electronic equipment and storage medium


Also Published As

Publication number Publication date
CN114202456A (en) 2022-03-18

Similar Documents

Publication Publication Date Title
WO2023087656A1 (en) Image generation method and apparatus
Tomei et al. Art2real: Unfolding the reality of artworks via semantically-aware image-to-image translation
Luo et al. Robust discrete code modeling for supervised hashing
US11074733B2 (en) Face-swapping apparatus and method
CN112330685B (en) Image segmentation model training method, image segmentation device and electronic equipment
CN108563782B (en) Commodity information format processing method and device, computer equipment and storage medium
TW202213275A (en) Image processing method and device, processor, electronic equipment and storage medium
CN112270686B (en) Image segmentation model training method, image segmentation device and electronic equipment
CN113592991A (en) Image rendering method and device based on nerve radiation field and electronic equipment
CN110147806A (en) Training method, device and the storage medium of image description model
CN112259247B (en) Method, device, equipment and medium for confrontation network training and medical data supplement
KR102332114B1 (en) Image processing method and apparatus thereof
US20220012846A1 (en) Method of modifying digital images
CN113192175A (en) Model training method and device, computer equipment and readable storage medium
CN112818995A (en) Image classification method and device, electronic equipment and storage medium
CN109460541A (en) Lexical relation mask method, device, computer equipment and storage medium
CN116189265A (en) Sketch face recognition method, device and equipment based on lightweight semantic transducer model
CN113886548A (en) Intention recognition model training method, recognition method, device, equipment and medium
JP2021051709A (en) Text processing apparatus, method, device, and computer-readable recording medium
Wu et al. Semantic key generation based on natural language
EP4242962A1 (en) Recognition system, recognition method, program, learning method, trained model, distillation model and training data set generation method
CN110780850B (en) Requirement case auxiliary generation method and device, computer equipment and storage medium
Ma et al. M3D-GAN: Multi-modal multi-domain translation with universal attention
CN113222100A (en) Training method and device of neural network model
CN112001566B (en) Optimization method, device, equipment and medium of fitness training model

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22894200

Country of ref document: EP

Kind code of ref document: A1