WO2023093356A1 - Image generation method and apparatus, and electronic device and storage medium - Google Patents

Info

Publication number
WO2023093356A1
Authority
WO
WIPO (PCT)
Prior art keywords
image generation
network
generation network
stylized
target
Prior art date
Application number
PCT/CN2022/125425
Other languages
French (fr)
Chinese (zh)
Inventor
顾天培
林纯泽
王权
钱晨
Original Assignee
上海商汤智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 上海商汤智能科技有限公司
Publication of WO2023093356A1

Classifications

    • G06T3/04
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Definitions

  • the present disclosure relates to the field of computer technology, and in particular to an image generation method and device, electronic equipment and storage media.
  • Image stylization is to convert the original image into a stylized image of a specific style, such as sketch portrait style, cartoon image style, oil painting style, etc.
  • the disclosure proposes a technical solution for image generation.
  • an image generation method, including: obtaining a realistic image generation network and a target stylized image generation network; inputting the same random data into the target stylized image generation network and the realistic image generation network respectively, to obtain the target stylized image output by the target stylized image generation network and the target realistic image output by the realistic image generation network, the target stylized image having a target style, wherein the target stylized image generation network is obtained by fusing the realistic image generation network with an original stylized image generation network, and the original stylized image generation network is used to generate images with the target style; and determining the target stylized image and the target realistic image corresponding to the same random data as a pair of paired images.
  • a large number of paired images can be generated by using random data, which not only reduces the construction cost of the paired images, but also allows the target stylized image in the constructed paired images to have both sufficient realistic details and sufficient stylized effects.
  • the realistic image generation network and the original stylized image generation network each have N network layers, N being a positive integer
  • fusing the realistic image generation network with the original stylized image generation network includes: replacing the first I network layers of the original stylized image generation network with the first I network layers of the realistic image generation network to obtain the target stylized image generation network, I ∈ [1, N); wherein the value of I is negatively correlated with the stylization degree of the target stylized image generated by the target stylized image generation network.
  • the target stylized image generated by the target stylized image generation network can have sufficient realistic details and sufficient stylized effect.
  • the first I network layers are used to learn low-resolution information of images, and the low-resolution information includes edge contour information and style information of images; wherein replacing the first I network layers of the original stylized image generation network with the first I network layers of the realistic image generation network includes: exchanging the low-resolution information learned by the first I network layers of the original stylized image generation network with the low-resolution information learned by the first I network layers of the realistic image generation network.
  • the target stylized image generation network can take into account the low-resolution information learned by the realistic image generation network and the high-resolution information learned by the original stylized image generation network, and can therefore generate target stylized images with sufficient realistic details and a sufficient stylized effect.
  • the realistic image generation network and the original stylized image generation network each have N network layers, N being a positive integer
  • fusing the realistic image generation network with the original stylized image generation network also includes: replacing the last J network layers of the original stylized image generation network with the last J network layers of the realistic image generation network to obtain the target stylized image generation network, J ∈ [1, N); wherein the value of J is negatively correlated with the stylization degree of the target stylized image generated by the target stylized image generation network.
  • the target stylized image generated by the target stylized image generation network can have sufficient realistic details and sufficient stylized effect.
  • the last J network layers are used to learn high-resolution information of images, and the high-resolution information includes detail information of images; wherein replacing the last J network layers of the original stylized image generation network with the last J network layers of the realistic image generation network includes: exchanging the high-resolution information learned by the last J network layers of the original stylized image generation network with the high-resolution information learned by the last J network layers of the realistic image generation network.
  • the target stylized image generation network can take into account the low-resolution information learned by the original stylized image generation network and the high-resolution information learned by the realistic image generation network, and can therefore generate target stylized images with sufficient realistic details and a sufficient stylized effect.
  • the original stylized image generation network is obtained by performing transfer learning on the realistic image generation network based on a stylized sample image, and the stylized sample image has the target style.
  • the network structure of the realistic image generation network and the original stylized image generation network can be the same.
  • performing transfer learning on the realistic image generation network based on the stylized sample image includes: acquiring the realistic image generation network and the stylized sample image with the target style; and using the stylized sample image to perform transfer learning on the realistic image generation network to obtain the original stylized image generation network.
  • the network structure of the realistic image generation network and the original stylized image generation network can be the same.
  • the realistic image generation network is obtained by training a resolution-increasing generative adversarial network model; the realistic image generation network has N network layers, every n network layers represent one resolution level, and the realistic image generation network is used to generate images of different resolutions level by level, where N is a positive integer and n ∈ [1, N).
  • there are multiple pairs of paired images, and the multiple pairs of paired images are used to train an initial network to obtain a target stylized network, which is used to convert an input image into an image with the target style.
  • the paired images can be used to effectively train a target stylization network that can convert an input image into an image with the target style.
  • an image generation device, including: an acquisition module, used to obtain a realistic image generation network and a target stylized image generation network; an output module, used to input the same random data into the target stylized image generation network and the realistic image generation network respectively, to obtain the target stylized image output by the target stylized image generation network and the target realistic image output by the realistic image generation network, the target stylized image having the target style, wherein the target stylized image generation network is obtained by fusing the realistic image generation network with the original stylized image generation network, and the original stylized image generation network is used to generate images with the target style; and a determining module, configured to determine the target stylized image and the target realistic image corresponding to the same random data as a pair of paired images.
  • an electronic device including: a processor; a memory for storing processor-executable instructions; wherein the processor is configured to call the instructions stored in the memory to execute the above-mentioned method.
  • a computer-readable storage medium on which computer program instructions are stored, and when the computer program instructions are executed by a processor, the above method is implemented.
  • a computer program including computer readable codes, and when the computer readable codes are run in an electronic device, a processor in the electronic device executes the above method.
  • by fusing the realistic image generation network with the original stylized image generation network to obtain the target stylized image generation network, and then using random data, a large number of paired images can be generated, which not only reduces the construction cost of the paired images but also allows the target stylized image in the constructed paired images to have sufficient realistic details and a sufficient stylized effect; in addition, when the paired images are used for network model training, a trained target stylized network can be obtained based on the paired images, and the obtained target stylized network can transform a realistic image into an image with sufficient realistic details and a sufficient stylized effect.
  • FIG. 1 shows a flowchart of an image generation method according to an embodiment of the present disclosure.
  • Fig. 2 shows a schematic diagram of network fusion according to an embodiment of the present disclosure.
  • FIG. 3 shows a schematic diagram of a degree of stylization according to an embodiment of the disclosure.
  • FIG. 4 shows a schematic diagram of a degree of stylization according to an embodiment of the disclosure.
  • FIG. 5 shows a block diagram of an image generating device according to an embodiment of the present disclosure.
  • Fig. 6 shows a block diagram of an electronic device according to an embodiment of the present disclosure.
  • Fig. 1 shows a flow chart of an image generation method according to an embodiment of the present disclosure
  • the image generation method may be executed by an electronic device such as a terminal device or a server
  • the terminal device may be user equipment (UE), a mobile device, a user terminal, a terminal, a cellular phone, a cordless phone, a personal digital assistant (PDA), a handheld device, a computing device, a vehicle-mounted device, a wearable device, etc.
  • the method may be implemented by a processor calling computer-readable instructions stored in a memory, or the method may be executed by a server.
  • the image generation method includes:
  • in step S11, a realistic image generation network and a target stylized image generation network are acquired.
  • the realistic image generation network is used to generate realistic images
  • the target stylized image generation network is used to generate target stylized images with the target style.
  • realistic images can be understood as images without the target style
  • target stylized images can be understood as images with the target style.
  • the target style can include any image style such as sketch portrait style, cartoon image style, oil painting style, comic style, etc.
  • the comic style can at least include, for example: SD doll, becoming a child, CG style 1, Impasto, dark Korean comics, Korean comics, CG style 2. It should be understood that the embodiment of the present disclosure does not limit the type of the target style.
  • the realistic image generation network can be obtained by training a resolution-increasing generative adversarial network model, which can include, for example, ProgressiveGAN, StyleGAN, StyleGANv1, StyleGANv2, StyleGANv3, and other network models derived from StyleGAN.
  • this kind of resolution-increasing generative adversarial network model includes a generation network G and a discriminator network D, and its basic principle can be simply understood as follows:
  • random data z (also called random noise) is input into the generation network G
  • the generation network G generates an image G(z) from this random data
  • the generated image G(z) is input into the discriminator network D, which judges whether the input image is real or is an image G(z) generated by the generation network G.
  • increasing by resolution can be understood as follows: the shallow network layers of the generation network G first learn and generate low-resolution images (such as 4×4 resolution), and as the network depth increases, the deeper layers gradually learn and generate higher-resolution images (such as 1024×1024 resolution).
  • the realistic image generation network has N network layers, every n network layers represent one resolution level, and the realistic image generation network is used to generate images of different resolutions level by level, where N is a positive integer and n ∈ [1, N); for example, a StyleGAN with 18 network layers can be used, where every two layers represent one resolution level, so that images are generated at resolution levels from 4×4 to 1024×1024.
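  • The 18-layer example above can be made concrete with a small sketch. It assumes, as the text states, that every two layers share one resolution level and that the resolution doubles from one level to the next; the function name and its parameters are hypothetical and are not part of the disclosure.

```python
def resolution_levels(num_layers: int = 18,
                      layers_per_level: int = 2,
                      base_resolution: int = 4) -> list[int]:
    """Return the output resolution assumed for each layer (0-based index).

    Assumption for illustration: the resolution doubles at every level,
    starting from base_resolution.
    """
    return [base_resolution * 2 ** (layer // layers_per_level)
            for layer in range(num_layers)]


levels = resolution_levels()
# Layers 0-1 work at 4x4, layers 2-3 at 8x8, and so on up to the last level.
print(levels)
```

  • With these assumptions, 18 layers give nine resolution levels, the first at 4×4 and the last at 1024×1024, which matches the range quoted in the text.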
  • the low-resolution information includes image edge contour information and style information
  • the high-resolution information includes image detail information.
  • the goal of the generation network G is to generate images realistic enough to deceive the discriminator network D
  • the goal of the discriminator network D is to identify, as far as possible, the images generated by the generation network G; the adversarial loss between the generation network G and the discriminator network D can be used to train this type of generative adversarial network model, and the trained generation network G can be used as the realistic image generation network. It should be understood that those skilled in the art may use any network training method known in the art to train this type of resolution-increasing generative adversarial network model, which is not limited by the embodiments of the present disclosure.
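  • The disclosure does not specify the form of the adversarial loss. As one common choice, and purely as an assumption for illustration, the sketch below uses the standard binary cross-entropy discriminator loss and the non-saturating generator loss, computed from discriminator scores in (0, 1). The function names are hypothetical.

```python
import math


def discriminator_loss(d_real: list[float], d_fake: list[float],
                       eps: float = 1e-8) -> float:
    """D is rewarded for scoring real images near 1 and generated images near 0."""
    terms = [math.log(r + eps) + math.log(1.0 - f + eps)
             for r, f in zip(d_real, d_fake)]
    return -sum(terms) / len(terms)


def generator_loss(d_fake: list[float], eps: float = 1e-8) -> float:
    """G is rewarded when D scores its generated images near 1 (non-saturating form)."""
    return -sum(math.log(f + eps) for f in d_fake) / len(d_fake)


# A generator that fools D (scores near 1) has a lower loss than one that does not.
print(generator_loss([0.9]), generator_loss([0.1]))
```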
  • the target stylized image generation network is obtained by fusing the realistic image generation network with the original stylized image generation network, and the original stylized image generation network is used to generate images with the target style; the original stylized image generation network is obtained by performing transfer learning on the realistic image generation network based on stylized sample images, which have the target style.
  • because the original stylized image generation network can be obtained by performing transfer learning on the realistic image generation network, the network structures of the two networks can be the same
  • in a possible implementation, the realistic image generation network and the original stylized image generation network can be exchanged and fused at specific network layers to obtain the target stylized image generation network, for example by replacing some network layers of the realistic image generation network with the corresponding network layers of the original stylized image generation network.
  • the target stylized image generated by the target stylized image generation network can have sufficient realistic details and sufficient stylized effects.
  • in step S12, the same random data is input into the target stylized image generation network and the realistic image generation network respectively, to obtain the target stylized image output by the target stylized image generation network and the target realistic image output by the realistic image generation network; the target stylized image has the target style.
  • as described above, random data z (also called random noise) is input into the generation network G, and the generation network G generates an image G(z) from this random data; that is, the realistic image generation network and the target stylized image generation network each use the random data to upsample step by step, or to increase the resolution step by step, to generate an image. Inputting the same random data into both networks therefore generates a paired target realistic image and target stylized image.
  • paired images can be efficiently generated, and the target realistic image and the target stylized image correspond to each other; in other words, the target stylized image is equivalent to a stylized version of the target realistic image.
  • the random data may include any type of value, such as a random vector, a feature matrix, or a random tensor, where the random vector may be a latent vector following a Gaussian distribution, which is not limited by this embodiment of the present disclosure.
  • in step S13, the target stylized image and the target realistic image corresponding to the same random data are determined as a pair of paired images.
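  • Steps S12 and S13 can be sketched as follows. The two "networks" here are hypothetical toy functions standing in for trained generation networks, and the latent vector is drawn from a Gaussian distribution as one of the options the text mentions; the sketch only shows how pairing by shared random data works.

```python
import random


def make_toy_generator(scale: float):
    """Hypothetical stand-in for a generation network: latent vector -> tiny 'image'."""
    def generate(z: list[float]) -> list[float]:
        return [scale * value for value in z]
    return generate


# Assumed placeholders for the realistic and target stylized generation networks.
realistic_generator = make_toy_generator(1.0)
target_stylized_generator = make_toy_generator(2.0)


def build_pairs(num_pairs: int, latent_dim: int = 4, seed: int = 0):
    rng = random.Random(seed)
    pairs = []
    for _ in range(num_pairs):
        # Random data z: a latent vector following a Gaussian distribution.
        z = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
        # Step S12: feed the SAME z to both networks.
        realistic_image = realistic_generator(z)
        stylized_image = target_stylized_generator(z)
        # Step S13: the two outputs for the same z form one pair.
        pairs.append((realistic_image, stylized_image))
    return pairs


pairs = build_pairs(3)
print(len(pairs))
```

  • Because each pair shares its random data, any number of pairs can be produced simply by drawing more latent vectors.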
  • the paired images can be used to train an initial network to obtain a trained network capable of transforming an input image into an image with a target style.
  • the initial network can adopt a deep learning network model, for example, a convolutional neural network, an adversarial neural network and other network models can be used. It should be understood that the embodiments of the present disclosure do not limit the network structure, network type, training method, etc. of the initial network.
  • the target stylized image generated by the target stylized image generation network can have sufficient realistic details and a sufficient stylized effect; thus, using the realistic image generation network and the target stylized image generation network, high-quality paired images can be efficiently generated.
  • the realistic image generation network and the original stylized image generation network can be exchanged and fused at a specific network layer to obtain the target stylized image generation network.
  • the realistic image generation network and the original stylized image generation network each have N network layers, N being a positive integer, and fusing the realistic image generation network with the original stylized image generation network can include:
  • replacing the first I network layers of the original stylized image generation network with the first I network layers of the realistic image generation network; that is, splicing the first I network layers of the realistic image generation network with the last N-I network layers of the original stylized image generation network.
  • FIG. 2 shows a schematic diagram of network fusion according to an embodiment of the present disclosure.
  • for example, the first N/2 network layers of the realistic image generation network can be spliced with the last N/2 network layers of the original stylized image generation network, that is, the first N/2 network layers of the original stylized image generation network are replaced with the first N/2 network layers of the realistic image generation network, to obtain the target stylized image generation network.
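  • The front-layer fusion described above amounts to splicing two layer sequences. The sketch below represents each network as a plain list of per-layer objects (strings here); this representation and the function name are assumptions for illustration, not the disclosure's implementation.

```python
def fuse_front_layers(realistic_layers: list, stylized_layers: list, i: int) -> list:
    """Splice the first i realistic layers with the last N - i stylized layers."""
    n = len(realistic_layers)
    if len(stylized_layers) != n:
        raise ValueError("both networks must have the same number of layers N")
    if not 1 <= i < n:
        raise ValueError("I must satisfy 1 <= I < N")
    return list(realistic_layers[:i]) + list(stylized_layers[i:])


# Hypothetical 18-layer networks; I = N/2 = 9 mirrors the N/2 example in the text.
realistic = [f"real_{k}" for k in range(1, 19)]
stylized = [f"style_{k}" for k in range(1, 19)]
fused = fuse_front_layers(realistic, stylized, i=9)
print(fused)
```

  • Since the original stylized network is described as sharing the realistic network's structure, the spliced layers remain compatible at the boundary; with real models the same idea would apply to per-layer weights.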
  • the output resolution of the N network layers increases layer by layer, that is, the first few network layers of the N network layers can be considered low-resolution layers, and the last few network layers can be considered high-resolution layers.
  • the target stylized image generation network obtained through the embodiments of the present disclosure can be simply understood as follows: in the process of generating the target stylized image, a realistic intermediate image is first generated step by step based on random data, and a stylized effect is then added step by step to the realistic intermediate image to obtain the target stylized image.
  • the low-resolution information includes image edge contour information and style information
  • the high-resolution information includes image detail information. That is, the first I network layers of the two networks are used to learn low-resolution information, and the last N-I network layers are used to learn high-resolution information.
  • replacing the first I network layers of the original stylized image generation network with the first I network layers of the realistic image generation network includes: exchanging the low-resolution information learned by the first I network layers of the original stylized image generation network with the low-resolution information learned by the first I network layers of the realistic image generation network.
  • the target stylized image generation network can take into account the low-resolution information learned by the realistic image generation network and the high-resolution information learned by the original stylized image generation network, and can therefore generate target stylized images with sufficient realistic details and a sufficient stylized effect.
  • the value of I can be set according to the required degree of stylization; the value of I is negatively correlated with the degree of stylization of the target stylized image generated by the target stylized image generation network. That is, the smaller the value of I, the closer the target stylized image is to the stylized effect (the less it resembles a realistic image) and the higher the degree of stylization; the larger the value of I, the closer the target stylized image is to the realistic effect (the more it resembles a realistic image) and the lower the degree of stylization.
  • Fig. 3 shows a schematic diagram of a degree of stylization according to an embodiment of the present disclosure. As shown in Fig.
  • the target stylized image generated by the target stylized image generation network can have sufficient realistic details and sufficient stylized effect.
  • the realistic image generation network and the original stylized image generation network each have N network layers, N being a positive integer, and fusing the realistic image generation network with the original stylized image generation network can also include:
  • replacing the last J network layers of the original stylized image generation network with the last J network layers of the realistic image generation network; that is, splicing the last J network layers of the realistic image generation network with the first N-J network layers of the original stylized image generation network.
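  • The back-layer fusion is the mirror image of the front-layer case. As before, the list-of-layers representation and the function name are assumptions for illustration only.

```python
def fuse_back_layers(realistic_layers: list, stylized_layers: list, j: int) -> list:
    """Splice the first N - j stylized layers with the last j realistic layers."""
    n = len(realistic_layers)
    if len(stylized_layers) != n:
        raise ValueError("both networks must have the same number of layers N")
    if not 1 <= j < n:
        raise ValueError("J must satisfy 1 <= J < N")
    return list(stylized_layers[:n - j]) + list(realistic_layers[n - j:])


# Hypothetical 18-layer networks with J = 4: the last four layers come from
# the realistic network, the first fourteen from the original stylized network.
realistic = [f"real_{k}" for k in range(1, 19)]
stylized = [f"style_{k}" for k in range(1, 19)]
fused_back = fuse_back_layers(realistic, stylized, j=4)
print(fused_back)
```

  • A larger J keeps more realistic layers in the result, which is consistent with the stated negative correlation between J and the degree of stylization.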
  • the output resolution of the N network layers increases layer by layer, that is, the first few network layers of the N network layers can be considered low-resolution layers, and the last few network layers can be considered high-resolution layers.
  • the target stylized image generation network obtained through the embodiments of the present disclosure can be simply understood as follows: in the process of generating the target stylized image, a stylized intermediate image is first generated step by step based on random data, and realistic details are then added step by step to the stylized intermediate image to obtain the target stylized image.
  • the low-resolution information includes image edge contour information and style information
  • the high-resolution information includes image detail information. That is, the first N-J network layers of the two networks are used to learn low-resolution information, and the last J network layers are used to learn high-resolution information.
  • replacing the last J network layers of the original stylized image generation network with the last J network layers of the realistic image generation network includes: exchanging the high-resolution information learned by the last J network layers of the original stylized image generation network with the high-resolution information learned by the last J network layers of the realistic image generation network.
  • the target stylized image generation network can take into account the low-resolution information learned by the original stylized image generation network and the high-resolution information learned by the realistic image generation network, and can therefore generate target stylized images with sufficient realistic details and a sufficient stylized effect.
  • the value of J can be set according to specific stylization requirements; the value of J is negatively correlated with the stylization degree of the target stylized image generated by the target stylized image generation network. That is, the larger the value of J, the closer the target stylized image is to the realistic effect (the more it resembles a realistic image) and the lower the degree of stylization; the smaller the value of J, the closer the target stylized image is to the stylized effect (the less it resembles a realistic image) and the higher the degree of stylization.
  • Fig. 4 shows a schematic diagram of the degree of stylization according to an embodiment of the present disclosure. As shown in Fig.
  • the target stylized image generated by the target stylized image generation network can have sufficient realistic details and sufficient stylized effect.
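  • the original stylized image generation network is obtained by performing transfer learning on the realistic image generation network based on the stylized sample images, including: acquiring the realistic image generation network and the stylized sample images with the target style; and using the stylized sample images to perform transfer learning on the realistic image generation network to obtain the original stylized image generation network.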
  • the realistic image generation network may be a generation network G trained according to the above-mentioned network training process.
  • transfer learning can be understood as enabling the realistic image generation network to learn the target style in the stylized sample images, so that it generates stylized images with the target style, that is, to obtain the original stylized image generation network.
  • alternatively, the original stylized image generation network can be obtained by training the above-mentioned resolution-increasing generative adversarial network model with reference to the training method of the realistic image generation network; transfer learning can then be performed on the original stylized image generation network, based on sample images, to obtain the realistic image generation network, which is not limited in this embodiment of the present disclosure.
  • the original stylized image generation network can be efficiently obtained, and it can keep the network structure of the realistic image generation network without increasing its number of parameters, which makes it convenient to subsequently fuse the realistic image generation network with the original stylized image generation network.
  • the same random data can be input into the target stylized image generation network and the realistic image generation network respectively to obtain the target stylized image and the target realistic image.
  • there can be multiple pieces of random data, that is, there can be multiple pairs of paired images.
  • the multiple pairs of paired images can be used to train the initial network to obtain the target stylized network, and the target stylized network is used to convert an input image into an image with the target style.
  • the initial network can adopt a deep learning network model known in the art, for example, a convolutional neural network, an adversarial neural network and other network models can be used. It should be understood that the embodiment of the present disclosure does not limit the network structure and network type of the initial network.
  • the training process of using the paired images to train the initial network may include, for example: inputting the target realistic image in a paired image into the initial network to obtain the predicted stylized image output by the initial network; and, according to the loss between the predicted stylized image and the target stylized image in the paired image, optimizing the network parameters of the initial network through gradient descent, back propagation, etc. until the loss converges, to obtain the target stylized network.
  • the loss between the predicted stylized image and the target stylized image can be determined according to the distance between them, where the distance can include the L1 distance or the L2 distance, and the loss is determined through a specified loss function (such as an L1 loss function or an L2 loss function).
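  • The training procedure and the L1/L2 losses described above can be sketched with a deliberately tiny model. The "initial network" here is a single scalar weight w, images are short lists of numbers, and the L2 loss is taken as mean squared error; all of these are assumptions chosen so the gradient can be written by hand, not the network or training setup of the disclosure.

```python
def l1_loss(pred: list[float], target: list[float]) -> float:
    """Mean absolute difference between predicted and target pixels."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def l2_loss(pred: list[float], target: list[float]) -> float:
    """Mean squared difference between predicted and target pixels."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def train_scale(pairs, lr: float = 0.1, steps: int = 200) -> float:
    """Fit predicted = w * realistic to the stylized targets by gradient descent on L2."""
    w = 0.0  # hypothetical one-parameter 'initial network'
    for _ in range(steps):
        grad, count = 0.0, 0
        for realistic_image, stylized_image in pairs:
            for x, y in zip(realistic_image, stylized_image):
                grad += 2.0 * (w * x - y) * x  # derivative of (w*x - y)^2 w.r.t. w
                count += 1
        w -= lr * grad / count
    return w


# Toy paired images in which the stylized image is exactly 3x the realistic one.
toy_pairs = [([1.0, 2.0], [3.0, 6.0]), ([0.5, -1.0], [1.5, -3.0])]
w = train_scale(toy_pairs)
print(w, l2_loss([w * x for x in toy_pairs[0][0]], toy_pairs[0][1]))
```

  • In practice the initial network would be a deep model trained with an automatic differentiation framework; the loop structure (predict, compute loss against the paired target, update parameters) is the part this sketch is meant to show.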
  • the above-mentioned training process of using paired images to train the initial network is an implementation method provided by the embodiments of the present disclosure.
  • those skilled in the art can use any network training method known in the art to train the initial network with the paired images and obtain the trained target stylized network.
  • the target stylized network can be applied to short video applications, photography applications, game applications, social applications, and comics of various styles
  • an actually collected face image can be converted into a stylized face image with the target style by using the target stylized network.
  • the paired images can be used to effectively train a target stylization network capable of converting an input image into an image with the target style.
  • the user need only provide a small number of stylized sample images to obtain the original stylized image generation network and the target stylized image generation network; random data can then be used to generate a large number of paired images, which not only reduces the construction cost of the paired images, but also allows the target stylized image in each constructed pair to have both sufficient realistic details and a sufficient stylized effect.
  • a trained target stylization network can be obtained based on the paired images, and the obtained target stylization network can transform the realistic image into an image with sufficient realistic details and sufficient stylized effect.
  • the present disclosure also provides image generation devices, electronic devices, computer-readable storage media, and programs, all of which can be used to implement any image generation method provided in the present disclosure; for the corresponding technical solutions and descriptions, refer to the corresponding records in the method section, which are not repeated here.
  • Fig. 5 shows a block diagram of an image generation device according to an embodiment of the present disclosure. As shown in Fig. 5, the device includes:
  • An acquisition module 101 configured to acquire a realistic image generation network and a target stylized image generation network
  • the output module 102 is configured to input the same random data into the target stylized image generation network and the realistic image generation network, respectively, to obtain the target stylized image output by the target stylized image generation network and the target realistic image output by the realistic image generation network, the target stylized image having the target style, wherein the target stylized image generation network is obtained by fusing the realistic image generation network with the original stylized image generation network, and the original stylized image generation network is used to generate images with the target style;
  • the determining module 103 is configured to determine the target stylized image and the target realistic image corresponding to the same random data as a pair of paired images.
  • the realistic image generation network and the original stylized image generation network each have N network layers, N being a positive integer, wherein the fusing of the realistic image generation network with the original stylized image generation network includes: replacing the first I network layers of the original stylized image generation network with the first I network layers of the realistic image generation network to obtain the target stylized image generation network, I ∈ [1,N); wherein the value of I is negatively correlated with the stylization degree of the target stylized image generated by the target stylized image generation network.
  • the first I network layers are used to learn low-resolution information of images, and the low-resolution information includes edge contour information and style information of images; wherein replacing the first I network layers of the original stylized image generation network with the first I network layers of the realistic image generation network includes: exchanging the low-resolution information learned by the first I network layers of the original stylized image generation network with the low-resolution information learned by the first I network layers of the realistic image generation network.
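  • The replacement of the first I layers can be sketched as a simple list operation. In this illustration each network is represented as an ordered list of per-layer parameters (plain strings here); the representation and the function name `fuse_first_layers` are assumptions, not the data structures of the disclosure.

```python
def fuse_first_layers(realistic_layers, stylized_layers, i):
    """Replace the first i layers of the stylized network with the realistic ones."""
    n = len(realistic_layers)
    if len(stylized_layers) != n:
        raise ValueError("both networks must have the same number of layers")
    if not 1 <= i < n:
        raise ValueError("i must satisfy 1 <= i < N")
    # Layers 1..i come from the realistic network, layers i+1..N stay stylized.
    return list(realistic_layers[:i]) + list(stylized_layers[i:])


realistic = [f"real_{k}" for k in range(1, 7)]   # N = 6 layers
stylized = [f"style_{k}" for k in range(1, 7)]
fused = fuse_first_layers(realistic, stylized, i=2)
print(fused)  # ['real_1', 'real_2', 'style_3', 'style_4', 'style_5', 'style_6']
```

  • A larger `i` leaves fewer layers of the original stylized network in the fused result, which matches the stated negative correlation between I and the stylization degree.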
  • the realistic image generation network and the original stylized image generation network each have N network layers, N being a positive integer, wherein the fusing of the realistic image generation network with the original stylized image generation network also includes: replacing the last J network layers of the original stylized image generation network with the last J network layers of the realistic image generation network to obtain the target stylized image generation network, J ∈ [1,N); wherein the value of J is negatively correlated with the stylization degree of the target stylized image generated by the target stylized image generation network.
  • the last J network layers are used to learn the high-resolution information of images, and the high-resolution information includes the detail information of images; wherein replacing the last J network layers of the original stylized image generation network with the last J network layers of the realistic image generation network includes: exchanging the high-resolution information learned by the last J network layers of the original stylized image generation network with the high-resolution information learned by the last J network layers of the realistic image generation network.
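  • A sketch of a fusion that replaces layers at both ends, and of how the number of remaining stylized layers falls as J grows, is given below. Several things here are assumptions rather than statements of the disclosure: the list-of-layers representation, the names `fuse_networks` and `stylized_fraction`, the requirement I + J < N (so that at least one stylized layer survives), and the use of "fraction of layers still taken from the stylized network" as a rough proxy for the stylization degree.

```python
def fuse_networks(realistic_layers, stylized_layers, i=0, j=0):
    """Replace the first i and the last j stylized layers with realistic ones."""
    n = len(realistic_layers)
    if len(stylized_layers) != n:
        raise ValueError("both networks must have the same number of layers")
    if i < 0 or j < 0 or i + j >= n:
        raise ValueError("need i, j >= 0 and i + j < N")
    fused = list(stylized_layers)
    fused[:i] = realistic_layers[:i]          # first i layers (low resolution)
    fused[n - j:] = realistic_layers[n - j:]  # last j layers (high resolution)
    return fused


def stylized_fraction(fused_layers, stylized_layers):
    """Share of layers that still come from the stylized network (a rough proxy)."""
    kept = sum(f == s for f, s in zip(fused_layers, stylized_layers))
    return kept / len(stylized_layers)


real_net = [f"real_{k}" for k in range(1, 7)]    # N = 6 layers
style_net = [f"style_{k}" for k in range(1, 7)]

fused_j2 = fuse_networks(real_net, style_net, j=2)
fractions = [stylized_fraction(fuse_networks(real_net, style_net, j=j), style_net)
             for j in (1, 2, 3)]
```

  • With these toy inputs, `fused_j2` keeps `style_1` to `style_4` and takes `real_5` and `real_6` from the realistic network, and `fractions` decreases as `j` increases.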
  • the original stylized image generation network is obtained by performing transfer learning on the realistic image generation network based on a stylized sample image, and the stylized sample image has the target style.
  • the performing transfer learning on the realistic image generation network based on the stylized sample image includes: acquiring the realistic image generation network and the stylized sample image with the target style; and using the stylized sample image to perform transfer learning on the realistic image generation network to obtain the original stylized image generation network.
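  • The idea of transfer learning (start from the realistic network's parameters, then fine-tune on a few stylized samples) can be sketched as below. This is a deliberately simplified toy: a real generation network would typically be fine-tuned with an adversarial objective, whereas here a plain squared-error objective on a single bias vector is swapped in. The parameter dictionary, its keys, and the function name `transfer_learn` are hypothetical.

```python
import copy


def transfer_learn(realistic_params, style_samples, lr=0.5, steps=50):
    """Fine-tune a copy of the realistic parameters toward the stylized samples."""
    params = copy.deepcopy(realistic_params)  # same structure, same starting values
    num = len(style_samples)
    # Per-pixel mean of the stylized samples is the toy fitting target.
    target = [sum(column) / num for column in zip(*style_samples)]
    for _ in range(steps):
        # Gradient of 0.5 * (b - t)^2 with respect to b is (b - t).
        params["output_bias"] = [b - lr * (b - t)
                                 for b, t in zip(params["output_bias"], target)]
    return params


realistic_params = {"layer_weights": [[1.0, 0.0], [0.0, 1.0]],
                    "output_bias": [0.0, 0.0]}
style_samples = [[0.2, 0.8], [0.4, 0.6]]  # a small number of stylized samples
stylized_params = transfer_learn(realistic_params, style_samples)
```

  • Because the fine-tuned parameters start as a copy of the realistic ones, the two parameter sets keep the same keys and shapes, which is what allows layers to be swapped between the two networks later.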
  • the realistic image generation network is obtained by training an image generative adversarial network model whose resolution increases level by level; the realistic image generation network has N network layers, every n network layers represent one resolution level, and the realistic image generation network is used to generate images of different resolutions level by level, where N is a positive integer and n ∈ [1,N).
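  • The relationship between N, n and the resolution levels can be made concrete with a short sketch. The starting resolution of 4 and the doubling of resolution at each level are assumptions borrowed from common progressive generator designs, and the requirement that N be divisible by n is a simplification; none of these is stated in the text. The function names are hypothetical.

```python
def resolution_schedule(num_layers, layers_per_level, base_resolution=4):
    """Output resolution of each level, assuming the resolution doubles per level."""
    if num_layers % layers_per_level != 0:
        raise ValueError("this sketch assumes N is divisible by n")
    levels = num_layers // layers_per_level
    return [base_resolution * 2 ** k for k in range(levels)]


def level_of_layer(layer_index, layers_per_level):
    """Map a 1-based layer index to its 0-based resolution level."""
    return (layer_index - 1) // layers_per_level


schedule = resolution_schedule(num_layers=18, layers_per_level=2)
print(schedule)  # [4, 8, 16, 32, 64, 128, 256, 512, 1024]
```

  • Under these assumptions, N = 18 and n = 2 give nine resolution levels; the first layers fall in the low-resolution levels and the last layers in the high-resolution levels.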
  • there are multiple pairs of paired images, and the multiple pairs of paired images are used to train the initial network to obtain a target stylized network, the target stylized network being used to convert an input image into an image with the target style.
  • the functions or modules included in the device provided by the embodiments of the present disclosure can be used to execute the methods described in the method embodiments above; for the specific implementation, refer to the description of the method embodiments above, which is not repeated here for brevity.
  • Embodiments of the present disclosure also provide a computer-readable storage medium, on which computer program instructions are stored, and the above-mentioned method is implemented when the computer program instructions are executed by a processor.
  • Computer readable storage media may be volatile or nonvolatile computer readable storage media.
  • An embodiment of the present disclosure also proposes an electronic device, including: a processor; a memory for storing instructions executable by the processor; wherein the processor is configured to invoke the instructions stored in the memory to execute the above method.
  • An embodiment of the present disclosure also provides a computer program, including computer readable codes, and when the computer readable codes are run in an electronic device, a processor in the electronic device executes the above method.
  • An embodiment of the present disclosure also provides a computer program product, including computer-readable codes, or a non-volatile computer-readable storage medium carrying computer-readable codes; when the computer-readable codes run in a processor of an electronic device, the processor in the electronic device executes the above method.
  • Electronic devices may be provided as terminals, servers, or other forms of devices.
  • This disclosure relates to the field of augmented reality.
  • By acquiring the image information of the target object in the real environment, and then using various vision-related algorithms to detect or identify the relevant features, states and attributes of the target object, an AR effect combining the virtual and the real that matches the specific application is obtained.
  • the target object may involve faces, limbs, gestures, actions, etc. related to the human body, or signs and markers related to objects, or sand tables, display areas or display items related to venues or places.
  • Vision-related algorithms may involve visual positioning, SLAM, 3D reconstruction, image registration, background segmentation, object key point extraction and tracking, object pose or depth detection, etc.
  • Specific applications can involve not only interactive scenarios related to real scenes or objects, such as guided tours, navigation, explanation, reconstruction, and virtual effect overlay and display, but also interactive scenarios involving special-effects processing related to people, such as makeup beautification, body beautification, special effect display, and virtual network display.
  • the relevant features, states and attributes of the target object can be detected or identified through the convolutional neural network.
  • the above-mentioned convolutional neural network is a network obtained by performing network training based on a deep learning framework.
  • FIG. 6 shows a block diagram of an electronic device 1900 according to an embodiment of the present disclosure.
  • the electronic device 1900 may be provided as a server or terminal.
  • electronic device 1900 includes processing component 1922, which further includes one or more processors, and a memory resource represented by memory 1932 for storing instructions executable by processing component 1922, such as application programs.
  • the application programs stored in memory 1932 may include one or more modules each corresponding to a set of instructions.
  • the processing component 1922 is configured to execute instructions to perform the above method.
  • the electronic device 1900 may also include a power component 1926 configured to perform power management of the electronic device 1900, a wired or wireless network interface 1950 configured to connect the electronic device 1900 to a network, and an input and output interface 1958.
  • the electronic device 1900 can operate based on the operating system stored in the memory 1932, such as the Microsoft server operating system (Windows Server TM ), the graphical user interface-based operating system (Mac OS X TM ) introduced by Apple Inc., and the multi-user and multi-process computer operating system (Unix TM ), a free and open source Unix-like operating system (Linux TM ), an open source Unix-like operating system (FreeBSD TM ), or the like.
  • a non-transitory computer-readable storage medium is also provided, such as the memory 1932 including computer program instructions, which can be executed by the processing component 1922 of the electronic device 1900 to implement the above method.
  • the present disclosure can be a system, method and/or computer program product.
  • a computer program product may include a computer readable storage medium having computer readable program instructions thereon for causing a processor to implement various aspects of the present disclosure.
  • a computer readable storage medium may be a tangible device that can retain and store instructions for use by an instruction execution device.
  • a computer readable storage medium may be, for example, but is not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing.
  • Computer-readable storage media include: a portable computer diskette, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), compact disc read-only memory (CD-ROM), a digital versatile disc (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch cards or raised structures in a groove with instructions stored thereon, and any suitable combination of the above.
  • computer-readable storage media are not to be construed as transient signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media (e.g., pulses of light through fiber optic cables), or transmitted electrical signals.
  • Computer readable program instructions described herein may be downloaded from a computer readable storage medium to a respective computing/processing device, or downloaded to an external computer or external storage device over a network, such as the Internet, a local area network, a wide area network, and/or a wireless network.
  • the network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers.
  • a network adapter card or a network interface in each computing/processing device receives computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in each computing/processing device.
  • Computer program instructions for performing the operations of the present disclosure may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++, and conventional procedural programming languages such as the "C" language or similar programming languages.
  • Computer readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server.
  • In the case of a remote computer, the remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (for example, via the Internet using an Internet service provider).
  • an electronic circuit such as a programmable logic circuit, field programmable gate array (FPGA), or programmable logic array (PLA)
  • These computer-readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, produce an apparatus for realizing the functions/actions specified in one or more blocks in the flowchart and/or block diagram.
  • These computer-readable program instructions can also be stored in a computer-readable storage medium, and these instructions cause computers, programmable data processing devices and/or other devices to work in a specific way, so that the computer-readable medium storing the instructions comprises an article of manufacture including instructions for implementing various aspects of the functions/acts specified in one or more blocks in the flowcharts and/or block diagrams.
  • each block in a flowchart or block diagram may represent a module, a program segment, or a portion of instructions that contains one or more executable instructions.
  • the functions noted in the block may occur out of the order noted in the figures. For example, two blocks in succession may, in fact, be executed substantially concurrently, or they may sometimes be executed in the reverse order, depending upon the functionality involved.
  • each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by a dedicated hardware-based system that performs the specified function or action, or may be implemented by a combination of dedicated hardware and computer instructions.
  • the computer program product can be specifically realized by means of hardware, software or a combination thereof.
  • In one optional embodiment, the computer program product is embodied as a computer storage medium, and in another optional embodiment, the computer program product is embodied as a software product, such as a software development kit (SDK).

Abstract

The present disclosure relates to an image generation method and apparatus, and an electronic device and a storage medium. According to the image generation method and apparatus, and the electronic device and the storage medium, the method comprises: acquiring a real image generation network and a target stylized image generation network; respectively inputting the same piece of random data into the target stylized image generation network and the real image generation network, so as to obtain a target stylized image output by the target stylized image generation network, and a target real image output by the real image generation network; and determining the target stylized image and the target real image, which correspond to the same piece of random data, as a pair of paired images.

Description

Image generation method and device, electronic device and storage medium
This disclosure claims priority to the Chinese patent application filed with the China Patent Office on November 26, 2021, with application number 202111417366.5 and titled "Image generation method and device, electronic device and storage medium", the entire contents of which are incorporated into this disclosure by reference.
Technical field
The present disclosure relates to the field of computer technology, and in particular to an image generation method and device, an electronic device, and a storage medium.
Background
Image stylization converts an original image into a stylized image of a specific style, such as a sketch portrait style, a cartoon style, or an oil painting style.
Summary
The present disclosure proposes a technical solution for image generation.
According to an aspect of the present disclosure, an image generation method is provided, including: acquiring a realistic image generation network and a target stylized image generation network; inputting the same random data into the target stylized image generation network and the realistic image generation network, respectively, to obtain a target stylized image output by the target stylized image generation network and a target realistic image output by the realistic image generation network, the target stylized image having a target style, wherein the target stylized image generation network is obtained by fusing the realistic image generation network with an original stylized image generation network, and the original stylized image generation network is used to generate images with the target style; and determining the target stylized image and the target realistic image corresponding to the same random data as a pair of paired images. In this way, a large number of paired images can be generated from random data, which not only reduces the construction cost of the paired images, but also allows the target stylized image in each constructed pair to have both sufficient realistic details and a sufficient stylized effect.
In a possible implementation, the realistic image generation network and the original stylized image generation network each have N network layers, N being a positive integer, wherein fusing the realistic image generation network with the original stylized image generation network includes: replacing the first I network layers of the original stylized image generation network with the first I network layers of the realistic image generation network to obtain the target stylized image generation network, I ∈ [1,N); wherein the value of I is negatively correlated with the stylization degree of the target stylized image generated by the target stylized image generation network. In this way, the target stylized image generated by the target stylized image generation network can have both sufficient realistic details and a sufficient stylized effect.
In a possible implementation, the first I network layers are used to learn low-resolution information of images, and the low-resolution information includes edge contour information and style information of images; wherein replacing the first I network layers of the original stylized image generation network with the first I network layers of the realistic image generation network includes: exchanging the low-resolution information learned by the first I network layers of the original stylized image generation network with the low-resolution information learned by the first I network layers of the realistic image generation network. In this way, the target stylized image generation network can combine the low-resolution information learned by the realistic image generation network with the high-resolution information learned by the original stylized image generation network, and can therefore generate target stylized images with both sufficient realistic details and a sufficient stylized effect.
In a possible implementation, the realistic image generation network and the original stylized image generation network each have N network layers, N being a positive integer, wherein fusing the realistic image generation network with the original stylized image generation network also includes: replacing the last J network layers of the original stylized image generation network with the last J network layers of the realistic image generation network to obtain the target stylized image generation network, J ∈ [1,N); wherein the value of J is negatively correlated with the stylization degree of the target stylized image generated by the target stylized image generation network. In this way, the target stylized image generated by the target stylized image generation network can have both sufficient realistic details and a sufficient stylized effect.
In a possible implementation, the last J network layers are used to learn high-resolution information of images, and the high-resolution information includes detail information of images; wherein replacing the last J network layers of the original stylized image generation network with the last J network layers of the realistic image generation network includes: exchanging the high-resolution information learned by the last J network layers of the original stylized image generation network with the high-resolution information learned by the last J network layers of the realistic image generation network. In this way, the target stylized image generation network can combine the low-resolution information learned by the stylized image generation network with the high-resolution information learned by the realistic image generation network, and can therefore generate target stylized images with both sufficient realistic details and a sufficient stylized effect.
In a possible implementation, the original stylized image generation network is obtained by performing transfer learning on the realistic image generation network based on stylized sample images, the stylized sample images having the target style. In this way, the realistic image generation network and the original stylized image generation network can have the same network structure.
In a possible implementation, performing transfer learning on the realistic image generation network based on the stylized sample images includes: acquiring the realistic image generation network and the stylized sample images with the target style; and using the stylized sample images to perform transfer learning on the realistic image generation network to obtain the original stylized image generation network. In this way, the realistic image generation network and the original stylized image generation network can have the same network structure.
In a possible implementation, the realistic image generation network is obtained by training an image generative adversarial network model whose resolution increases level by level; the realistic image generation network has N network layers, every n network layers represent one resolution level, and the realistic image generation network is used to generate images of different resolutions level by level, where N is a positive integer and n ∈ [1,N).
In a possible implementation, there are multiple pairs of paired images, and the multiple pairs of paired images are used to train an initial network to obtain a target stylized network, the target stylized network being used to convert an input image into an image with the target style. In this way, the paired images can be used to effectively train a target stylized network capable of converting an input image into an image with the target style.
根据本公开的一方面,提供了一种图像生成装置,包括:获取模块,用于获取真实 化图像生成网络以及目标风格化图像生成网络;输出模块,用于将同一随机数据分别输入至所述目标风格化图像生成网络以及所述真实化图像生成网络,得到所述目标风格化图像生成网络输出的目标风格化图像,以及所述真实化图像生成网络输出的目标真实化图像,所述目标风格化图像具有目标风格,其中,所述目标风格化图像生成网络是将所述真实化图像生成网络与风格化图像生成网络进行融合得到的,所述原始风格化图像生成网络用于生成具有所述目标风格的图像;确定模块,用于将同一个随机数据对应的所述目标风格化图像与所述目标真实化图像,确定为一对配对图像。According to an aspect of the present disclosure, an image generation device is provided, including: an acquisition module, used to obtain a realistic image generation network and a target stylized image generation network; an output module, used to input the same random data into the The target stylized image generation network and the realistic image generation network obtain the target stylized image output by the target stylized image generation network, and the target realistic image output by the realistic image generation network, and the target style The stylized image has the target style, wherein the target stylized image generation network is obtained by fusing the realistic image generation network with the stylized image generation network, and the original stylized image generation network is used to generate the An image of the target style; a determining module, configured to determine the target stylized image and the target realized image corresponding to the same random data as a pair of paired images.
根据本公开的一方面,提供了一种电子设备,包括:处理器;用于存储处理器可执行指令的存储器;其中,所述处理器被配置为调用所述存储器存储的指令,以执行上述方法。According to an aspect of the present disclosure, there is provided an electronic device, including: a processor; a memory for storing processor-executable instructions; wherein the processor is configured to call the instructions stored in the memory to execute the above-mentioned method.
根据本公开的一方面,提供了一种计算机可读存储介质,其上存储有计算机程序指令,所述计算机程序指令被处理器执行时实现上述方法。According to one aspect of the present disclosure, there is provided a computer-readable storage medium, on which computer program instructions are stored, and when the computer program instructions are executed by a processor, the above method is implemented.
根据本公开的一方面,提供了一种计算机程序,包括计算机可读代码,当所述计算机可读代码在电子设备中运行时,所述电子设备中的处理器执行上述方法。According to one aspect of the present disclosure, a computer program is provided, including computer readable codes, and when the computer readable codes are run in an electronic device, a processor in the electronic device executes the above method.
在本公开实施例中,通过将真实化图像生成网络与原始风格化图像生成网络进行融合,得到目标风格化生成网络,进而利用随机数据,便可以生成大量配对图像,不仅降低了配对图像的构造成本,并且构造的配对图像中的目标风格化图像能够兼具足够的真实化细节与足够的风格化效果;另外,在将配对图像应用于网络模型训练中时,基于配对图像可以得到训练后的目标风格化网络,该得到的目标风格化网络能够将真实化图像转化成兼具足够的真实化细节与足够的风格化效果的图像。In the embodiment of the present disclosure, by fusing the realistic image generation network with the original stylized image generation network to obtain the target stylized generation network, and then using random data, a large number of paired images can be generated, which not only reduces the construction of paired images cost, and the target stylized image in the constructed paired image can have enough realistic details and enough stylized effect; in addition, when the paired image is applied to the network model training, the trained image can be obtained based on the paired image The target stylization network, the obtained target stylization network can transform the realistic image into an image with sufficient realistic details and sufficient stylized effect.
It should be understood that the foregoing general description and the following detailed description are exemplary and explanatory only, and do not limit the present disclosure. Other features and aspects of the present disclosure will become apparent from the following detailed description of exemplary embodiments with reference to the accompanying drawings.
Description of Drawings
The accompanying drawings are incorporated into and constitute a part of this specification. They illustrate embodiments consistent with the present disclosure and, together with the specification, serve to explain the technical solutions of the present disclosure.
FIG. 1 shows a flowchart of an image generation method according to an embodiment of the present disclosure.
FIG. 2 shows a schematic diagram of network fusion according to an embodiment of the present disclosure.
FIG. 3 shows a schematic diagram of a degree of stylization according to an embodiment of the present disclosure.
FIG. 4 shows a schematic diagram of a degree of stylization according to an embodiment of the present disclosure.
FIG. 5 shows a block diagram of an image generation apparatus according to an embodiment of the present disclosure.
FIG. 6 shows a block diagram of an electronic device according to an embodiment of the present disclosure.
Detailed Description
Various exemplary embodiments, features, and aspects of the present disclosure are described in detail below with reference to the accompanying drawings. The same reference numerals in the drawings denote elements with the same or similar functions. Although various aspects of the embodiments are shown in the drawings, the drawings are not necessarily drawn to scale unless specifically indicated.
The word "exemplary" as used herein means "serving as an example, embodiment, or illustration". Any embodiment described herein as "exemplary" is not necessarily to be construed as superior to or better than other embodiments.
The term "and/or" herein merely describes an association relationship between associated objects and indicates that three relationships are possible. For example, "A and/or B" may mean: A alone, both A and B, or B alone. In addition, the term "at least one" herein means any one of multiple items or any combination of at least two of them. For example, "including at least one of A, B, and C" may mean including any one or more elements selected from the set consisting of A, B, and C.
In addition, numerous specific details are given in the following detailed description in order to better illustrate the present disclosure. Those skilled in the art should understand that the present disclosure can be practiced without some of these specific details. In some instances, methods, means, elements, and circuits well known to those skilled in the art are not described in detail, so as not to obscure the gist of the present disclosure.
FIG. 1 shows a flowchart of an image generation method according to an embodiment of the present disclosure. The image generation method may be executed by an electronic device such as a terminal device or a server. The terminal device may be user equipment (UE), a mobile device, a user terminal, a terminal, a cellular phone, a cordless phone, a personal digital assistant (PDA), a handheld device, a computing device, a vehicle-mounted device, a wearable device, or the like. The method may be implemented by a processor calling computer-readable instructions stored in a memory, or the method may be executed by a server. As shown in FIG. 1, the image generation method includes:
In step S11, a realistic image generation network and a target stylized image generation network are acquired.
The realistic image generation network is used to generate realistic images, and the target stylized image generation network is used to generate target stylized images having a target style. A realistic image can be understood as an image that does not have the target style, and a target stylized image as an image that has the target style. In a possible implementation, the target style may include any image style, such as a sketch portrait style, a cartoon style, an oil painting style, or a comic style. The comic style may include, for example, at least: SD doll, child transformation, CG style 1, impasto, dark Korean comic, Korean comic, and CG style 2. It should be understood that the embodiments of the present disclosure do not limit the type of the target style.
In a possible implementation, the realistic image generation network may be obtained by training a resolution-increasing image generative adversarial network (GAN) model. Such models may include, for example, ProgressiveGAN, StyleGAN, StyleGANv1, StyleGANv2, StyleGANv3, and other networks derived from StyleGAN.
As is known, a resolution-increasing image GAN model includes a generation network G and a discrimination network D. Its basic principle can be summarized as follows: random data z (also called random noise) is input into the generation network G, which generates an image G(z) from it; the generated image G(z) is input into the discrimination network D, which judges whether the input image is real or is an image G(z) produced by the generation network G.
Here, "resolution-increasing" means that the shallow layers of the generation network G first learn and generate low-resolution images (for example, 4×4), and as the network depth increases, the deeper layers learn and generate higher-resolution images (for example, 1024×1024).
In a possible implementation, the realistic image generation network has N network layers, every n network layers represent one resolution level, and the realistic image generation network generates images of different resolutions level by level, where N is a positive integer and n∈[1,N). For example, a StyleGAN with 18 network layers may be used, in which every two layers represent one resolution level, so that images are generated level by level from 4×4 to 1024×1024. With this structure, fusing the realistic image generation network with the original stylized image generation network is equivalent to exchanging the low-resolution information and the high-resolution information that the two networks have each learned. The low-resolution information includes the edge contour information and style information of the image, and the high-resolution information includes the detail information of the image.
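The layer-to-resolution relationship in the 18-layer example can be sketched as follows. This is a minimal illustration, assuming that the resolution doubles from one level to the next (which is consistent with 9 levels spanning 4×4 to 1024×1024); the function name `resolution_levels` and its parameters are hypothetical and do not come from the disclosure.

```python
def resolution_levels(num_layers=18, layers_per_level=2, base_resolution=4):
    """Return the output resolution associated with each layer (0-based index).

    Assumption: each resolution level doubles the side length of the previous one.
    """
    if num_layers % layers_per_level != 0:
        raise ValueError("num_layers must be a multiple of layers_per_level")
    resolutions = []
    for layer in range(num_layers):
        level = layer // layers_per_level  # layers 0-1 -> level 0, layers 2-3 -> level 1, ...
        resolutions.append(base_resolution * 2 ** level)
    return resolutions


levels = resolution_levels()
print(levels[0], levels[-1])  # first layers work at 4x4, last layers at 1024x1024
```

Under these assumptions, the early layers correspond to coarse information and the late layers to fine detail, which is the property the layer-swapping fusion relies on.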
During training of the above resolution-increasing image GAN model, the goal of the generation network G is to generate images realistic enough to deceive the discrimination network D, while the goal of the discrimination network D is to identify the images generated by G. The model can be trained using the adversarial loss between the generation network G and the discrimination network D, and the trained generation network G can be used as the realistic image generation network. It should be understood that those skilled in the art may train this type of resolution-increasing image GAN model using any network training method known in the art, which is not limited by the embodiments of the present disclosure.
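The adversarial objective described above can be illustrated with one common formulation. The disclosure does not specify a particular loss, so the binary cross-entropy discriminator loss and the non-saturating generator loss below are assumptions chosen for illustration, and the function names are hypothetical.

```python
import math


def discriminator_loss(d_real, d_fake):
    """Loss for D, where d_real = D(x) and d_fake = D(G(z)) are probabilities in (0, 1).

    The loss is small when D scores real images near 1 and generated images near 0.
    """
    return -(math.log(d_real) + math.log(1.0 - d_fake))


def generator_loss(d_fake):
    """Non-saturating loss for G: small when D scores the generated image as real."""
    return -math.log(d_fake)


# A discriminator that separates real from generated images has a lower loss
# than one that cannot tell them apart (scores of 0.5 for both).
confident = discriminator_loss(d_real=0.9, d_fake=0.1)
undecided = discriminator_loss(d_real=0.5, d_fake=0.5)
print(confident < undecided)
```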
In a possible implementation, the target stylized image generation network is obtained by fusing the realistic image generation network with an original stylized image generation network. The original stylized image generation network is used to generate images having the target style, and is obtained by performing transfer learning on the realistic image generation network based on stylized sample images that have the target style.
As described above, the original stylized image generation network may be obtained by transfer learning from the realistic image generation network, so the two networks may have the same network structure. In a possible implementation, the realistic image generation network and the original stylized image generation network may be fused by swapping specific network layers, thereby obtaining the target stylized image generation network. For example, certain network layers of the realistic image generation network may be replaced with the network layers at the corresponding positions in the original stylized image generation network.
In this way, the target stylized image generated by the target stylized image generation network can have both sufficient realistic detail and a sufficient stylized effect.
In step S12, the same random data is input into the target stylized image generation network and the realistic image generation network respectively, to obtain a target stylized image output by the target stylized image generation network and a target realistic image output by the realistic image generation network, where the target stylized image has the target style.
As described above, random data z (also called random noise) is input into the generation network G, and G generates an image G(z) from it. In other words, both the realistic image generation network and the target stylized image generation network generate images from random data by upsampling level by level, that is, by increasing the resolution level by level. Therefore, to generate a paired target realistic image and target stylized image, the same random data is input into both networks.
In this way, paired images can be generated efficiently, and the target realistic image and the target stylized image correspond to each other. In other words, the target stylized image is equivalent to a stylized version of the target realistic image.
In a possible implementation, the random data may include values of any type, such as a random vector, a feature matrix, or a random tensor. The random vector may be a latent vector that follows a Gaussian distribution, which is not limited by the embodiments of the present disclosure.
In step S13, the target stylized image and the target realistic image corresponding to the same random data are determined as a pair of paired images.
There may be multiple pieces of random data, and multiple pairs of paired images may be generated. It can be understood that the target realistic image and the target stylized image in each pair are generated from the same random data.
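Steps S12 and S13 can be sketched as follows. This is a toy illustration only: `make_toy_generator` is a hypothetical stand-in for a trained generation network (it simply shifts the latent values), and all names and parameters are assumptions rather than part of the disclosure. The point it shows is that each pair is built by feeding one and the same latent vector to both networks.

```python
import random


def make_toy_generator(offset):
    """Hypothetical stand-in for a trained generator: maps a latent vector to a tiny 'image'."""
    def generate(z):
        return [value + offset for value in z]
    return generate


def sample_latent(dim, rng):
    """Draw a latent vector whose entries follow a standard Gaussian distribution."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def build_pairs(realistic_gen, stylized_gen, num_pairs, dim=4, seed=0):
    """Generate (target realistic image, target stylized image) pairs from shared latents."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(num_pairs):
        z = sample_latent(dim, rng)          # one piece of random data ...
        realistic_image = realistic_gen(z)   # ... fed to the realistic network
        stylized_image = stylized_gen(z)     # ... and to the target stylized network
        pairs.append((realistic_image, stylized_image))
    return pairs


pairs = build_pairs(make_toy_generator(0.0), make_toy_generator(1.0), num_pairs=3)
print(len(pairs))
```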
In a possible implementation, the paired images may be used to train an initial network to obtain a trained network that can convert an input image into an image having the target style. The initial network may be a deep learning network model, such as a convolutional neural network or an adversarial neural network. It should be understood that the embodiments of the present disclosure do not limit the network structure, network type, or training method of the initial network.
In the embodiments of the present disclosure, the target stylized image generation network is obtained by fusing the realistic image generation network with the original stylized image generation network, so that the target stylized image it generates can have both sufficient realistic detail and a sufficient stylized effect. High-quality paired images can therefore be generated efficiently using the realistic image generation network and the target stylized image generation network.
As described above, the realistic image generation network and the original stylized image generation network may be fused by swapping specific network layers, thereby obtaining the target stylized image generation network. In a possible implementation, the realistic image generation network and the original stylized image generation network each have N network layers, where N is a positive integer, and fusing the realistic image generation network with the original stylized image generation network may include:
replacing the first I network layers of the original stylized image generation network with the first I network layers of the realistic image generation network to obtain the target stylized image generation network, where I∈[1,N).
Replacing the first I network layers of the original stylized image generation network with the first I network layers of the realistic image generation network is equivalent to splicing the first I network layers of the realistic image generation network with the last N-I network layers of the original stylized image generation network.
For example, FIG. 2 shows a schematic diagram of network fusion according to an embodiment of the present disclosure. As shown in FIG. 2, the first N/2 network layers of the realistic image generation network may be spliced with the last N/2 network layers of the original stylized image generation network. That is, the first N/2 network layers of the original stylized image generation network are replaced with the first N/2 network layers of the realistic image generation network to obtain the target stylized image generation network.
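The splicing described above can be sketched with plain lists, where each list element stands for one network layer (in practice, that layer's parameters). The function name `fuse_front` and the string labels are hypothetical and used only for illustration.

```python
def fuse_front(realistic_layers, stylized_layers, i):
    """Splice the first i realistic layers with the last N - i stylized layers."""
    n = len(realistic_layers)
    if len(stylized_layers) != n:
        raise ValueError("both networks must have the same number of layers")
    if not 1 <= i < n:
        raise ValueError("i must satisfy 1 <= i < N")
    return list(realistic_layers[:i]) + list(stylized_layers[i:])


n = 18
realistic = [f"real_{k}" for k in range(n)]
stylized = [f"style_{k}" for k in range(n)]

# The FIG. 2 example: the first N/2 layers come from the realistic network.
target = fuse_front(realistic, stylized, n // 2)
print(target[0], target[n // 2 - 1], target[n // 2], target[-1])
```

Because both networks share the same structure, the fused list always has N layers, and a smaller `i` keeps more layers from the original stylized network.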
The output resolution of the N network layers increases layer by layer. That is, the first few of the N network layers can be regarded as low-resolution levels, and the last few as high-resolution levels. It can therefore be understood that, when the target stylized image generation network obtained in this embodiment generates a target stylized image, it first generates a realistic intermediate image level by level from the random data, and then adds the stylized effect to the realistic intermediate image level by level to obtain the target stylized image.
As described above, fusing the realistic image generation network with the original stylized image generation network is equivalent to exchanging the low-resolution information and the high-resolution information that the two networks have each learned. The low-resolution information includes the edge contour information and style information of the image, and the high-resolution information includes the detail information of the image. That is, the first I network layers of the two networks are used to learn low-resolution information, and the last N-I network layers are used to learn high-resolution information.
In a possible implementation, replacing the first I network layers of the original stylized image generation network with the first I network layers of the realistic image generation network includes: exchanging the low-resolution information learned by the first I network layers of the original stylized image generation network with the low-resolution information learned by the first I network layers of the realistic image generation network. In this way, the target stylized image generation network combines the low-resolution information learned by the realistic image generation network with the high-resolution information learned by the original stylized image generation network, and can thus generate target stylized images that have both sufficient realistic detail and a sufficient stylized effect.
In a possible implementation, the value of I may be set according to the required degree of stylization. The value of I is negatively correlated with the degree of stylization of the target stylized image generated by the target stylized image generation network. That is, the smaller the value of I, the closer the generated target stylized image is to the stylized effect (the less it resembles a realistic image) and the higher the degree of stylization; the larger the value of I, the closer the target stylized image is to the realistic effect (the more it resembles a realistic image) and the lower the degree of stylization. FIG. 3 shows a schematic diagram of a degree of stylization according to an embodiment of the present disclosure. As shown in FIG. 3, the larger the value of I, the closer the result is to the realistic effect, that is, the more it resembles a real face; the smaller the value of I, the closer the result is to the stylized effect, that is, the less it resembles a real face.
In the embodiments of the present disclosure, the target stylized image generated by the target stylized image generation network can have both sufficient realistic detail and a sufficient stylized effect.
In a possible implementation, the realistic image generation network and the original stylized image generation network each have N network layers, where N is a positive integer, and fusing the realistic image generation network with the original stylized image generation network may alternatively include:
replacing the last J network layers of the original stylized image generation network with the last J network layers of the realistic image generation network to obtain the target stylized image generation network, where J∈[1,N).
Replacing the last J network layers of the original stylized image generation network with the last J network layers of the realistic image generation network is equivalent to splicing the last J network layers of the realistic image generation network with the first N-J network layers of the original stylized image generation network.
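This second fusion variant can be sketched in the same list-based style, with each list element standing for one network layer. The function name `fuse_back` and the labels are hypothetical and used only for illustration.

```python
def fuse_back(realistic_layers, stylized_layers, j):
    """Splice the first N - j stylized layers with the last j realistic layers."""
    n = len(realistic_layers)
    if len(stylized_layers) != n:
        raise ValueError("both networks must have the same number of layers")
    if not 1 <= j < n:
        raise ValueError("j must satisfy 1 <= j < N")
    return list(stylized_layers[:n - j]) + list(realistic_layers[n - j:])


n = 18
realistic = [f"real_{k}" for k in range(n)]
stylized = [f"style_{k}" for k in range(n)]

# Keep the stylized low-resolution layers and take the last 6 layers from the realistic network.
target = fuse_back(realistic, stylized, 6)
print(target[0], target[11], target[12], target[-1])
```

Here a larger `j` takes more high-resolution layers from the realistic network, so the output moves toward the realistic effect.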
As described above, the output resolution of the N network layers increases layer by layer. That is, the first few of the N network layers can be regarded as low-resolution levels, and the last few as high-resolution levels. It can therefore be understood that, when the target stylized image generation network obtained in this embodiment generates a target stylized image, it first generates a stylized intermediate image level by level from the random data, and then adds realistic detail to the stylized intermediate image level by level to obtain the target stylized image.
As described above, fusing the realistic image generation network with the original stylized image generation network is equivalent to exchanging the low-resolution information and the high-resolution information that the two networks have each learned. The low-resolution information includes the edge contour information and style information of the image, and the high-resolution information includes the detail information of the image. That is, the first N-J network layers of the two networks are used to learn low-resolution information, and the last J network layers are used to learn high-resolution information.
In a possible implementation, replacing the last J network layers of the original stylized image generation network with the last J network layers of the realistic image generation network includes: exchanging the high-resolution information learned by the last J network layers of the original stylized image generation network with the high-resolution information learned by the last J network layers of the realistic image generation network. In this way, the target stylized image generation network combines the low-resolution information learned by the original stylized image generation network with the high-resolution information learned by the realistic image generation network, and can thus generate target stylized images that have both sufficient realistic detail and a sufficient stylized effect.
In a possible implementation, the value of J may be set according to specific stylization requirements. The value of J is negatively correlated with the degree of stylization of the target stylized image generated by the target stylized image generation network. That is, the larger the value of J, the closer the generated target stylized image is to the realistic effect (the more it resembles a realistic image) and the lower the degree of stylization; the smaller the value of J, the closer the target stylized image is to the stylized effect (the less it resembles a realistic image) and the higher the degree of stylization. FIG. 4 shows a schematic diagram of a degree of stylization according to an embodiment of the present disclosure. As shown in FIG. 4, the smaller the value of J, the higher the degree of stylization, that is, the less the result resembles a real face; the larger the value of J, the lower the degree of stylization, that is, the more the result resembles a real face.
In the embodiments of the present disclosure, the target stylized image generated by the target stylized image generation network can have both sufficient realistic detail and a sufficient stylized effect.
As described above, the original stylized image generation network is obtained by performing transfer learning on the realistic image generation network based on stylized sample images. In a possible implementation, performing transfer learning on the realistic image generation network based on stylized sample images includes:
acquiring the realistic image generation network and stylized sample images having the target style; and performing transfer learning on the realistic image generation network using the stylized sample images to obtain the original stylized image generation network.
The realistic image generation network may be the generation network G trained according to the network training process described above. Transfer learning can be understood as making the realistic image generation network learn the target style from the stylized sample images so that it generates stylized images having the target style, which yields the original stylized image generation network.
It should be understood that those skilled in the art may use any transfer learning technique known in the art to perform transfer learning on the realistic image generation network using the stylized sample images and obtain the original stylized image generation network, which is not limited by the embodiments of the present disclosure.
In a possible implementation, the original stylized image generation network may instead be obtained by training the above resolution-increasing image GAN model, following the training method described for the realistic image generation network. Transfer learning may then be performed on the original stylized image generation network using realistic sample images to obtain the realistic image generation network, which is not limited by the embodiments of the present disclosure.
In the embodiments of the present disclosure, the original stylized image generation network can be obtained efficiently, and it retains the network structure of the realistic image generation network without any increase in the number of parameters, which facilitates the subsequent fusion of the realistic image generation network with the original stylized image generation network.
As described above, the same random data may be input into the target stylized image generation network and the realistic image generation network respectively to obtain a target stylized image and a target realistic image. There may be multiple pieces of random data, so there may be multiple pairs of paired images. In a possible implementation, the multiple pairs of paired images may be used to train an initial network to obtain a target stylization network, which is used to convert an input original image into an image having the target style.
As described above, the initial network may be a deep learning network model known in the art, such as a convolutional neural network or an adversarial neural network. It should be understood that the embodiments of the present disclosure do not limit the network structure or network type of the initial network.
在一种可能的实现方式中,利用配对图像训练初始网络的训练过程,例如可以包括:将配对图像中的目标真实化图像输入至初始网络中,得到初始网络输出的预测风格化图像;根据预测风格化图像与配对图像中的目标风格化图像之间的损失,通过梯度下降、反向传播等方式,优化初始网络的网络参数至该损失收敛,得到目标风格化网络。In a possible implementation manner, the training process of using the paired images to train the initial network may include, for example: inputting the target realization image in the paired images into the initial network to obtain the predicted stylized image output by the initial network; The loss between the stylized image and the target stylized image in the paired image, through gradient descent, back propagation, etc., optimize the network parameters of the initial network until the loss converges, and obtain the target stylized network.
The loss between the predicted stylized image and the target stylized image can be determined according to the distance between the two images. The distance may include the L1 distance or the L2 distance between the predicted stylized image and the target stylized image, and the loss is determined through a specified loss function (for example, an L1 loss function or an L2 loss function).
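The training procedure described above can be sketched in plain Python under strong simplifying assumptions: images are flat lists of floats, and the "initial network" is a hypothetical per-pixel linear map `w * x + b` rather than a real convolutional network. The sketch only shows how an L1 or L2 loss between the predicted and target stylized images is computed and how gradient descent runs until the loss falls below a tolerance.

```python
def l1_loss(predicted, target):
    """Mean absolute difference between two flat images."""
    return sum(abs(p - t) for p, t in zip(predicted, target)) / len(target)


def l2_loss(predicted, target):
    """Mean squared difference between two flat images."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


def train(pairs, learning_rate=0.1, max_steps=2000, tolerance=1e-10):
    """Fit the toy network w * x + b by gradient descent on the L2 loss."""
    w, b = 1.0, 0.0  # initial parameters of the hypothetical toy network
    loss = float("inf")
    for _ in range(max_steps):
        grad_w = grad_b = total = 0.0
        count = 0
        for realistic, stylized in pairs:
            predicted = [w * x + b for x in realistic]
            total += l2_loss(predicted, stylized) * len(stylized)
            for x, p, t in zip(realistic, predicted, stylized):
                grad_w += 2.0 * (p - t) * x  # d(loss)/dw for one pixel
                grad_b += 2.0 * (p - t)      # d(loss)/db for one pixel
            count += len(stylized)
        loss = total / count
        if loss < tolerance:  # treat a sufficiently small loss as convergence
            break
        w -= learning_rate * grad_w / count
        b -= learning_rate * grad_b / count
    return w, b, loss


# One toy pair: target realistic image and its target stylized image (0.5 * x + 2).
toy_pairs = [([0.0, 1.0, 2.0, 3.0], [2.0, 2.5, 3.0, 3.5])]
w, b, final_loss = train(toy_pairs)
```

In a real setting the linear map would be replaced by the chosen network and the hand-written gradients by automatic differentiation; the loss functions themselves keep the same form.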
It should be understood that the above process of training the initial network with paired images is one implementation provided by the embodiments of the present disclosure. In practice, those skilled in the art can use any network training method known in the art to train the initial network with the paired images and obtain the trained target stylization network.
In a possible implementation, after the trained target stylization network is obtained, it can be applied in short video applications, photography applications, game applications, social applications, and cartoon face generation tools of various styles, so that the target stylization network can be used to convert an actually captured face image into a stylized face image with the target style.
In the embodiments of the present disclosure, the paired images can be used to effectively train a target stylization network capable of converting an input image into an image with the target style.
According to the image generation method in the embodiments of the present disclosure, a user only needs to provide a small number of stylized sample images to obtain the original stylized image generation network and the target stylized image generation network, and a large number of paired images can then be generated from random data. This reduces the cost of constructing paired images, and the target stylized images in the constructed paired images combine sufficient realistic detail with a sufficient stylized effect. In addition, when the paired images are used for network model training, a trained target stylization network can be obtained from them, and this network can convert a realistic image into an image that combines sufficient realistic detail with a sufficient stylized effect.
It can be understood that the method embodiments mentioned in the present disclosure can be combined with one another to form combined embodiments, provided that doing so does not violate their principles and logic; for brevity, the combinations are not described here. Those skilled in the art can understand that, in the above methods of the specific implementations, the specific execution order of the steps should be determined by their functions and possible internal logic.
In addition, the present disclosure further provides an image generation apparatus, an electronic device, a computer-readable storage medium, and a program, each of which can be used to implement any image generation method provided by the present disclosure. For the corresponding technical solutions and descriptions, refer to the corresponding records in the method section; they are not repeated here.
FIG. 5 shows a block diagram of an image generation apparatus according to an embodiment of the present disclosure. As shown in FIG. 5, the apparatus includes:
an acquisition module 101, configured to acquire a realistic image generation network and a target stylized image generation network;
an output module 102, configured to input the same random data into the target stylized image generation network and the realistic image generation network respectively, to obtain a target stylized image output by the target stylized image generation network and a target realistic image output by the realistic image generation network, the target stylized image having a target style, wherein the target stylized image generation network is obtained by fusing the realistic image generation network with an original stylized image generation network, and the original stylized image generation network is used to generate images with the target style;
a determining module 103, configured to determine the target stylized image and the target realistic image that correspond to the same random data as a pair of paired images.
In a possible implementation, the realistic image generation network and the original stylized image generation network each have N network layers, N being a positive integer, wherein fusing the realistic image generation network with the original stylized image generation network includes: replacing the first I network layers of the original stylized image generation network with the first I network layers of the realistic image generation network to obtain the target stylized image generation network, I∈[1,N); wherein the value of I is negatively correlated with the degree of stylization of the target stylized image generated by the target stylized image generation network.
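The layer replacement can be sketched as follows. This is a minimal illustration that assumes each network is represented as an ordered list of layers (plain strings here, with hypothetical names); the function name `fuse_first_layers` is not taken from the disclosure.

```python
def fuse_first_layers(realistic_layers, stylized_layers, i):
    """Replace the first i layers of the stylized network with those of the realistic network."""
    n = len(realistic_layers)
    if len(stylized_layers) != n:
        raise ValueError("both networks must have the same number of layers N")
    if not 1 <= i < n:
        raise ValueError("I must satisfy 1 <= I < N")
    # First i layers come from the realistic network, the remaining N - i from the stylized one.
    return list(realistic_layers[:i]) + list(stylized_layers[i:])


# Hypothetical 6-layer networks; each string stands in for one layer's parameters.
realistic_layers = [f"real_{k}" for k in range(1, 7)]
stylized_layers = [f"style_{k}" for k in range(1, 7)]

fused = fuse_first_layers(realistic_layers, stylized_layers, i=2)
# fused == ['real_1', 'real_2', 'style_3', 'style_4', 'style_5', 'style_6']
```

A larger `i` leaves fewer layers from the original stylized network in the fused result, which matches the stated negative correlation between I and the degree of stylization.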
In a possible implementation, the first I network layers are used to learn low-resolution information of an image, and the low-resolution information includes edge contour information and style information of the image; wherein replacing the first I network layers of the original stylized image generation network with the first I network layers of the realistic image generation network includes: exchanging the low-resolution information learned by the first I network layers of the original stylized image generation network with the low-resolution information learned by the first I network layers of the realistic image generation network.
In a possible implementation, the realistic image generation network and the original stylized image generation network each have N network layers, N being a positive integer, wherein fusing the realistic image generation network with the original stylized image generation network further includes: replacing the last J network layers of the original stylized image generation network with the last J network layers of the realistic image generation network to obtain the target stylized image generation network, J∈[1,N); wherein the value of J is negatively correlated with the degree of stylization of the target stylized image generated by the target stylized image generation network.
In a possible implementation, the last J network layers are used to learn high-resolution information of an image, and the high-resolution information includes detail information of the image; wherein replacing the last J network layers of the original stylized image generation network with the last J network layers of the realistic image generation network includes: exchanging the high-resolution information learned by the last J network layers of the original stylized image generation network with the high-resolution information learned by the last J network layers of the realistic image generation network.
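Replacing the last J layers can be sketched in the same spirit. The example assumes each network is an ordered list of layers represented by hypothetical string names; `fuse_last_layers` and `count_stylized` are illustrative helpers, not names from the disclosure.

```python
def fuse_last_layers(realistic_layers, stylized_layers, j):
    """Replace the last j layers of the stylized network with those of the realistic network."""
    n = len(realistic_layers)
    if len(stylized_layers) != n:
        raise ValueError("both networks must have the same number of layers N")
    if not 1 <= j < n:
        raise ValueError("J must satisfy 1 <= J < N")
    # First N - j layers stay stylized, the last j layers come from the realistic network.
    return list(stylized_layers[:n - j]) + list(realistic_layers[n - j:])


def count_stylized(fused_layers, stylized_layers):
    """Count how many layers of the fused network still come from the stylized network."""
    return sum(1 for layer in fused_layers if layer in stylized_layers)


# Hypothetical 6-layer networks; each string stands in for one layer's parameters.
realistic_layers = [f"real_{k}" for k in range(1, 7)]
stylized_layers = [f"style_{k}" for k in range(1, 7)]

remaining = {
    j: count_stylized(fuse_last_layers(realistic_layers, stylized_layers, j), stylized_layers)
    for j in (1, 3, 5)
}
# remaining == {1: 5, 3: 3, 5: 1}
```

As J grows, fewer stylized layers remain, so the fused network takes more of its high-resolution detail from the realistic network and the output is less strongly stylized.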
In a possible implementation, the original stylized image generation network is obtained by performing transfer learning on the realistic image generation network based on stylized sample images, and the stylized sample images have the target style.
In a possible implementation, performing transfer learning on the realistic image generation network based on the stylized sample images includes: acquiring the realistic image generation network and stylized sample images having the target style; and performing transfer learning on the realistic image generation network using the stylized sample images to obtain the original stylized image generation network.
In a possible implementation, the realistic image generation network is obtained by training an image generative adversarial network model whose resolution increases progressively. The realistic image generation network has N network layers, every n network layers represent one resolution level, and the realistic image generation network is used to generate images of different resolutions level by level, where N is a positive integer and n∈[1,N).
In a possible implementation, there are multiple pairs of paired images, the multiple pairs of paired images are used to train an initial network to obtain a target stylization network, and the target stylization network is used to convert an input image into an image with the target style.
In some embodiments, the functions of, or the modules included in, the apparatus provided by the embodiments of the present disclosure can be used to execute the methods described in the method embodiments above. For the specific implementation, refer to the description of the method embodiments above; for brevity, it is not repeated here.
An embodiment of the present disclosure further provides a computer-readable storage medium on which computer program instructions are stored, and the above method is implemented when the computer program instructions are executed by a processor. The computer-readable storage medium may be a volatile or non-volatile computer-readable storage medium.
An embodiment of the present disclosure further provides an electronic device, including: a processor; and a memory for storing instructions executable by the processor; wherein the processor is configured to invoke the instructions stored in the memory to execute the above method.
An embodiment of the present disclosure further provides a computer program including computer-readable code; when the computer-readable code runs in an electronic device, a processor in the electronic device executes the above method. An embodiment of the present disclosure further provides a computer program product including computer-readable code, or a non-volatile computer-readable storage medium carrying computer-readable code; when the computer-readable code runs in a processor of an electronic device, the processor in the electronic device executes the above method.
The electronic device may be provided as a terminal, a server, or a device in another form.
The present disclosure relates to the field of augmented reality. Image information of a target object in a real environment is acquired, and various vision-related algorithms are then used to detect or recognize relevant features, states, and attributes of the target object, so as to obtain an AR effect that combines the virtual and the real and matches a specific application. For example, the target object may be a face, limb, gesture, or action related to a human body, a marker or sign related to an object, or a sand table, display area, or display item related to a venue or place. The vision-related algorithms may involve visual positioning, SLAM, three-dimensional reconstruction, image registration, background segmentation, key point extraction and tracking of objects, and pose or depth detection of objects. Specific applications may involve interactive scenarios related to real scenes or objects, such as guided tours, navigation, explanation, reconstruction, and overlaid display of virtual effects, and may also involve special-effect processing related to people, such as makeup beautification, body beautification, special-effect display, and virtual model display. The detection or recognition of the relevant features, states, and attributes of the target object can be implemented with a convolutional neural network, which is a network model obtained by training based on a deep learning framework.
FIG. 6 shows a block diagram of an electronic device 1900 according to an embodiment of the present disclosure. For example, the electronic device 1900 may be provided as a server or a terminal. Referring to FIG. 6, the electronic device 1900 includes a processing component 1922, which further includes one or more processors, and memory resources represented by a memory 1932 for storing instructions executable by the processing component 1922, such as application programs. The application programs stored in the memory 1932 may include one or more modules, each corresponding to a set of instructions. In addition, the processing component 1922 is configured to execute the instructions to perform the above method.
The electronic device 1900 may further include a power component 1926 configured to perform power management of the electronic device 1900, a wired or wireless network interface 1950 configured to connect the electronic device 1900 to a network, and an input/output interface 1958. The electronic device 1900 can operate based on an operating system stored in the memory 1932, such as Microsoft's server operating system (Windows Server™), Apple's graphical-user-interface-based operating system (Mac OS X™), the multi-user, multi-process computer operating system (Unix™), the free and open-source Unix-like operating system (Linux™), the open-source Unix-like operating system (FreeBSD™), or the like.
In an exemplary embodiment, a non-volatile computer-readable storage medium is further provided, such as the memory 1932 including computer program instructions, which can be executed by the processing component 1922 of the electronic device 1900 to complete the above method.
The present disclosure may be a system, a method, and/or a computer program product. The computer program product may include a computer-readable storage medium carrying computer-readable program instructions for causing a processor to implement various aspects of the present disclosure.
The computer-readable storage medium may be a tangible device that can retain and store instructions used by an instruction execution device. The computer-readable storage medium may be, for example, but is not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of computer-readable storage media include: a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disc read-only memory (CD-ROM), a digital versatile disc (DVD), a memory stick, a floppy disk, a mechanically encoded device such as a punch card or a raised structure in a groove with instructions stored thereon, and any suitable combination of the foregoing. A computer-readable storage medium as used herein is not to be construed as a transient signal per se, such as a radio wave or other freely propagating electromagnetic wave, an electromagnetic wave propagating through a waveguide or other transmission medium (for example, a light pulse through a fiber optic cable), or an electrical signal transmitted through a wire.
The computer-readable program instructions described herein can be downloaded from a computer-readable storage medium to respective computing/processing devices, or downloaded to an external computer or external storage device over a network such as the Internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, optical fiber transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers. A network adapter card or network interface in each computing/processing device receives the computer-readable program instructions from the network and forwards them for storage in a computer-readable storage medium in the respective computing/processing device.
The computer program instructions for performing the operations of the present disclosure may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++, and conventional procedural programming languages such as the "C" language or similar programming languages. The computer-readable program instructions may be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, an electronic circuit, such as a programmable logic circuit, a field programmable gate array (FPGA), or a programmable logic array (PLA), is personalized by using state information of the computer-readable program instructions, and the electronic circuit can execute the computer-readable program instructions to implement various aspects of the present disclosure.
Aspects of the present disclosure are described herein with reference to flowcharts and/or block diagrams of methods, apparatuses (systems), and computer program products according to embodiments of the present disclosure. It should be understood that each block of the flowcharts and/or block diagrams, and combinations of blocks in the flowcharts and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, produce an apparatus that implements the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams. These computer-readable program instructions may also be stored in a computer-readable storage medium; the instructions cause a computer, a programmable data processing apparatus, and/or other devices to work in a specific manner, so that the computer-readable medium storing the instructions includes an article of manufacture that includes instructions implementing various aspects of the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams.
The computer-readable program instructions may also be loaded onto a computer, another programmable data processing apparatus, or another device, so that a series of operational steps is performed on the computer, other programmable data processing apparatus, or other device to produce a computer-implemented process, such that the instructions executed on the computer, other programmable data processing apparatus, or other device implement the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams.
The flowcharts and block diagrams in the drawings show the architecture, functions, and operations of possible implementations of systems, methods, and computer program products according to multiple embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of instructions, which contains one or more executable instructions for implementing the specified logical function. In some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.
The computer program product can be implemented by hardware, software, or a combination thereof. In one optional embodiment, the computer program product is embodied as a computer storage medium; in another optional embodiment, the computer program product is embodied as a software product, such as a software development kit (SDK).
The embodiments of the present disclosure have been described above. The foregoing description is exemplary, not exhaustive, and is not limited to the disclosed embodiments. Many modifications and changes will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terms used herein are chosen to best explain the principles of the embodiments, their practical applications, or their improvements over technologies in the market, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (21)

  1. An image generation method, characterized by comprising:
    acquiring a realistic image generation network and a target stylized image generation network;
    inputting the same random data into the target stylized image generation network and the realistic image generation network respectively, to obtain a target stylized image output by the target stylized image generation network and a target realistic image output by the realistic image generation network, the target stylized image having a target style, wherein the target stylized image generation network is obtained by fusing the realistic image generation network with an original stylized image generation network, and the original stylized image generation network is used to generate images with the target style;
    determining the target stylized image and the target realistic image that correspond to the same random data as a pair of paired images.
  2. The method according to claim 1, characterized in that the realistic image generation network and the original stylized image generation network each have N network layers, N being a positive integer, wherein fusing the realistic image generation network with the original stylized image generation network comprises:
    replacing the first I network layers of the original stylized image generation network with the first I network layers of the realistic image generation network to obtain the target stylized image generation network, I∈[1,N); wherein the value of I is negatively correlated with the degree of stylization of the target stylized image generated by the target stylized image generation network.
  3. The method according to claim 2, characterized in that the first I network layers of the realistic image generation network and of the original stylized image generation network are used to learn low-resolution information of an image, the low-resolution information comprising edge contour information and style information of the image;
    wherein replacing the first I network layers of the original stylized image generation network with the first I network layers of the realistic image generation network comprises:
    exchanging the low-resolution information learned by the first I network layers of the original stylized image generation network with the low-resolution information learned by the first I network layers of the realistic image generation network.
  4. The method according to claim 1, characterized in that the realistic image generation network and the original stylized image generation network each have N network layers, N being a positive integer, wherein fusing the realistic image generation network with the original stylized image generation network further comprises:
    replacing the last J network layers of the original stylized image generation network with the last J network layers of the realistic image generation network to obtain the target stylized image generation network, J∈[1,N); wherein the value of J is negatively correlated with the degree of stylization of the target stylized image generated by the target stylized image generation network.
  5. The method according to claim 4, wherein the last J network layers of the realistic image generation network and of the original stylized image generation network are used to learn high-resolution information of an image, the high-resolution information comprising detail information of the image;
    wherein replacing the last J network layers of the original stylized image generation network with the last J network layers of the realistic image generation network comprises:
    exchanging the high-resolution information learned by the last J network layers of the original stylized image generation network with the high-resolution information learned by the last J network layers of the realistic image generation network.
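The last-J-layer replacement of claims 4 and 5 can be sketched in the same spirit. This is a minimal sketch under stated assumptions: networks are modelled as lists of layer labels, `swap_last_layers` is a hypothetical helper, and the count of surviving stylized layers is only a crude, assumed proxy for the "degree of stylization" that the claim says decreases as J grows.

```python
from typing import List, Sequence


def swap_last_layers(stylized: Sequence[str], realistic: Sequence[str], j: int) -> List[str]:
    """Keep the first N-J stylized layers and take the last J layers from the realistic network."""
    n = len(stylized)
    if len(realistic) != n:
        raise ValueError("both networks must have the same number of layers N")
    if not 1 <= j < n:
        raise ValueError("J must satisfy 1 <= J < N")
    return list(stylized[: n - j]) + list(realistic[n - j:])


# Hypothetical 6-layer generators; each string stands in for one layer's weights.
stylized_layers = [f"S{k}" for k in range(6)]
realistic_layers = [f"R{k}" for k in range(6)]

fused_j2 = swap_last_layers(stylized_layers, realistic_layers, 2)

# Number of stylized layers left for J = 1, 2, 3 (assumed proxy for stylization degree).
surviving = [
    sum(name.startswith("S") for name in swap_last_layers(stylized_layers, realistic_layers, j))
    for j in (1, 2, 3)
]
print(fused_j2)   # ['S0', 'S1', 'S2', 'S3', 'R4', 'R5']
print(surviving)  # [5, 4, 3]
```

A larger J leaves fewer stylized layers in the fused network, which is consistent with the negative correlation stated in claim 4.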
  6. The method according to any one of claims 1 to 5, wherein the original stylized image generation network is obtained by performing transfer learning on the realistic image generation network based on stylized sample images, the stylized sample images having the target style.
  7. The method according to claim 6, wherein performing transfer learning on the realistic image generation network based on the stylized sample images comprises:
    acquiring the realistic image generation network and stylized sample images having the target style;
    performing transfer learning on the realistic image generation network by using the stylized sample images, to obtain the original stylized image generation network.
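The two steps of claim 7 might be outlined as below. The claim does not specify the training objective, so this sketch leaves it as a caller-supplied callback; `transfer_learn`, `toy_step`, the flat list-of-floats weights and the sample values are all hypothetical placeholders rather than the method of the disclosure. The only point illustrated is that training starts from a copy of the realistic network's weights instead of from scratch.

```python
import copy
from typing import Callable, List, Sequence

Weights = List[float]


def transfer_learn(
    realistic_weights: Sequence[float],
    stylized_samples: Sequence[Sequence[float]],
    train_step: Callable[[Weights, Sequence[float]], Weights],
    epochs: int = 1,
) -> Weights:
    """Fine-tune a copy of the realistic weights on stylized samples."""
    # Start from the pretrained weights; the realistic network itself is left untouched.
    weights = copy.deepcopy(list(realistic_weights))
    for _ in range(epochs):
        for sample in stylized_samples:
            weights = train_step(weights, sample)
    return weights


def toy_step(weights: Weights, sample: Sequence[float]) -> Weights:
    """Placeholder update: move each weight halfway toward the sample value."""
    return [w + 0.5 * (s - w) for w, s in zip(weights, sample)]


realistic_weights = [0.0, 0.0]
stylized_samples = [[1.0, 1.0]] * 3  # toy stand-ins for stylized sample images

original_stylized_weights = transfer_learn(realistic_weights, stylized_samples, toy_step)
print(original_stylized_weights)  # [0.875, 0.875]
```

In a real system the callback would be replaced by whatever update rule the chosen training procedure uses.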
  8. The method according to any one of claims 1 to 5, wherein the realistic image generation network is obtained by training a generative adversarial network model for image generation whose resolution increases progressively, the realistic image generation network has N network layers, every n network layers represent one resolution level, the realistic image generation network is used to generate images of different resolutions level by level, N is a positive integer, and n∈[1,N).
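To make the relation between N, n and the resolution levels in claim 8 concrete, a small sketch follows. The values N = 18, n = 2, a base resolution of 4 pixels and a doubling of resolution per level are assumptions chosen for illustration; the claim itself only requires that every n layers form one resolution level and that resolution increases level by level.

```python
def layer_resolution(layer_index: int, n_total: int, n_per_level: int, base: int = 4) -> int:
    """Return the (assumed) output resolution of the level that a given layer belongs to."""
    if not 1 <= n_per_level < n_total:
        raise ValueError("n must satisfy 1 <= n < N")
    if not 0 <= layer_index < n_total:
        raise ValueError("layer index out of range")
    level = layer_index // n_per_level  # every n consecutive layers share one level
    return base * 2 ** level            # assumed: resolution doubles at each level


# Hypothetical configuration: N = 18 layers, n = 2 layers per resolution level.
resolutions = [layer_resolution(k, n_total=18, n_per_level=2) for k in range(18)]
print(resolutions[:4])  # [4, 4, 8, 8]
print(resolutions[-1])  # 1024
```

Under these assumptions the first few layers operate at low resolution and the last few at high resolution, which is the split that the first-I and last-J layer replacements rely on.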
  9. The method according to any one of claims 1 to 5, wherein there are multiple pairs of paired images, the multiple pairs of paired images are used to train an initial network to obtain a target stylization network, and the target stylization network is used to convert an input image into an image having the target style.
  10. An image generation apparatus, comprising:
    an acquisition module, configured to acquire a realistic image generation network and a target stylized image generation network;
    an output module, configured to input the same random data into the target stylized image generation network and the realistic image generation network respectively, to obtain a target stylized image output by the target stylized image generation network and a target realistic image output by the realistic image generation network, the target stylized image having a target style, wherein the target stylized image generation network is obtained by fusing the realistic image generation network with an original stylized image generation network, and the original stylized image generation network is used to generate images having the target style;
    a determining module, configured to determine the target stylized image and the target realistic image corresponding to the same random data as a pair of paired images.
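The data flow through the output module and the determining module can be sketched as follows. This is only a toy illustration: the two "generators" are hypothetical functions on short lists of floats rather than trained networks, and `make_pairs`, `latent_dim` and `seed` are names introduced here. What the sketch shows is that one random vector is drawn per pair and fed to both generators, so the two outputs correspond to each other.

```python
import random
from typing import Callable, List, Tuple

Latent = List[float]
Image = List[float]  # toy stand-in for an image tensor


def make_pairs(
    realistic_gen: Callable[[Latent], Image],
    stylized_gen: Callable[[Latent], Image],
    num_pairs: int,
    latent_dim: int = 4,
    seed: int = 0,
) -> List[Tuple[Image, Image]]:
    """Feed the same random data to both generators and collect (realistic, stylized) pairs."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(num_pairs):
        z = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]  # one shared random vector
        pairs.append((realistic_gen(z), stylized_gen(z)))
    return pairs


def toy_realistic(z: Latent) -> Image:
    """Hypothetical realistic generator: returns the latent values unchanged."""
    return list(z)


def toy_stylized(z: Latent) -> Image:
    """Hypothetical stylized generator: a coarsely quantized version of the same content."""
    return [float(round(v)) for v in z]


pairs = make_pairs(toy_realistic, toy_stylized, num_pairs=3)
print(len(pairs))  # 3
```

Pairs built this way could then serve as (input, target) examples for supervised training of an image-to-image stylization network.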
  11. The apparatus according to claim 10, wherein the realistic image generation network and the original stylized image generation network each have N network layers, N being a positive integer, and wherein fusing the realistic image generation network with the original stylized image generation network comprises:
    replacing the first I network layers of the original stylized image generation network with the first I network layers of the realistic image generation network to obtain the target stylized image generation network, I∈[1,N), wherein the value of I is negatively correlated with the degree of stylization of the target stylized image generated by the target stylized image generation network.
  12. The apparatus according to claim 11, wherein the first I network layers of the realistic image generation network and of the original stylized image generation network are used to learn low-resolution information of an image, the low-resolution information comprising edge contour information and style information of the image;
    wherein the first I network layers of the original stylized image generation network are replaced with the first I network layers of the realistic image generation network in the following manner:
    exchanging the low-resolution information learned by the first I network layers of the original stylized image generation network with the low-resolution information learned by the first I network layers of the realistic image generation network.
  13. The apparatus according to claim 10, wherein the realistic image generation network and the original stylized image generation network each have N network layers, N being a positive integer, and wherein fusing the realistic image generation network with the original stylized image generation network further comprises:
    replacing the last J network layers of the original stylized image generation network with the last J network layers of the realistic image generation network to obtain the target stylized image generation network, J∈[1,N), wherein the value of J is negatively correlated with the degree of stylization of the target stylized image generated by the target stylized image generation network.
  14. The apparatus according to claim 13, wherein the last J network layers of the realistic image generation network and of the original stylized image generation network are used to learn high-resolution information of an image, the high-resolution information comprising detail information of the image;
    wherein replacing the last J network layers of the original stylized image generation network with the last J network layers of the realistic image generation network comprises:
    exchanging the high-resolution information learned by the last J network layers of the original stylized image generation network with the high-resolution information learned by the last J network layers of the realistic image generation network.
  15. The apparatus according to any one of claims 10 to 14, wherein the original stylized image generation network is obtained by performing transfer learning on the realistic image generation network based on stylized sample images, the stylized sample images having the target style.
  16. The apparatus according to claim 15, wherein performing transfer learning on the realistic image generation network based on the stylized sample images comprises:
    acquiring the realistic image generation network and stylized sample images having the target style;
    performing transfer learning on the realistic image generation network by using the stylized sample images, to obtain the original stylized image generation network.
  17. The apparatus according to any one of claims 10 to 14, wherein the realistic image generation network is obtained by training a generative adversarial network model for image generation whose resolution increases progressively, the realistic image generation network has N network layers, every n network layers represent one resolution level, the realistic image generation network is used to generate images of different resolutions level by level, N is a positive integer, and n∈[1,N).
  18. The apparatus according to any one of claims 10 to 14, wherein there are multiple pairs of paired images, the multiple pairs of paired images are used to train an initial network to obtain a target stylization network, and the target stylization network is used to convert an input image into an image having the target style.
  19. An electronic device, comprising:
    a processor;
    a memory for storing processor-executable instructions;
    wherein the processor is configured to invoke the instructions stored in the memory to execute the method according to any one of claims 1 to 9.
  20. A computer-readable storage medium storing computer program instructions, wherein the computer program instructions, when executed by a processor, implement the method according to any one of claims 1 to 9.
  21. A computer program, comprising computer-readable code, wherein when the computer-readable code runs in an electronic device, a processor in the electronic device executes instructions for implementing the method according to any one of claims 1 to 9.
PCT/CN2022/125425 2021-11-26 2022-10-14 Image generation method and apparatus, and electronic device and storage medium WO2023093356A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111417366.5A CN113837934B (en) 2021-11-26 2021-11-26 Image generation method and device, electronic equipment and storage medium
CN202111417366.5 2021-11-26

Publications (1)

Publication Number Publication Date
WO2023093356A1 (en) 2023-06-01

Family

ID=78971499

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/125425 WO2023093356A1 (en) 2021-11-26 2022-10-14 Image generation method and apparatus, and electronic device and storage medium

Country Status (2)

Country Link
CN (1) CN113837934B (en)
WO (1) WO2023093356A1 (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113837934B (en) * 2021-11-26 2022-02-22 北京市商汤科技开发有限公司 Image generation method and device, electronic equipment and storage medium
CN114418835A (en) * 2022-01-25 2022-04-29 北京字跳网络技术有限公司 Image processing method, apparatus, device and medium
CN114418919B (en) * 2022-03-25 2022-07-26 北京大甜绵白糖科技有限公司 Image fusion method and device, electronic equipment and storage medium
CN115357218A (en) * 2022-08-02 2022-11-18 北京航空航天大学 High-entropy random number generation method based on chaos prediction antagonistic learning
CN115170390B (en) * 2022-08-31 2023-01-06 广州极尚网络技术有限公司 File stylization method, device, equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200151938A1 (en) * 2018-11-08 2020-05-14 Adobe Inc. Generating stylized-stroke images from source images utilizing style-transfer-neural networks with non-photorealistic-rendering
CN111784565A (en) * 2020-07-01 2020-10-16 北京字节跳动网络技术有限公司 Image processing method, migration model training method, device, medium and equipment
CN112967174A (en) * 2021-01-21 2021-06-15 北京达佳互联信息技术有限公司 Image generation model training method, image generation device and storage medium
CN113111791A (en) * 2021-04-16 2021-07-13 深圳市格灵人工智能与机器人研究院有限公司 Image filter conversion network training method and computer readable storage medium
CN113837934A (en) * 2021-11-26 2021-12-24 北京市商汤科技开发有限公司 Image generation method and device, electronic equipment and storage medium

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10565757B2 (en) * 2017-06-09 2020-02-18 Adobe Inc. Multimodal style-transfer network for applying style features from multi-resolution style exemplars to input images
CN110310221B (en) * 2019-06-14 2022-09-20 大连理工大学 Multi-domain image style migration method based on generation countermeasure network
CN111223039A (en) * 2020-01-08 2020-06-02 广东博智林机器人有限公司 Image style conversion method and device, electronic equipment and storage medium
CN111402112A (en) * 2020-03-09 2020-07-10 北京字节跳动网络技术有限公司 Image processing method, image processing device, electronic equipment and computer readable medium
CN111667400B (en) * 2020-05-30 2021-03-30 温州大学大数据与信息技术研究院 Human face contour feature stylization generation method based on unsupervised learning

Also Published As

Publication number Publication date
CN113837934B (en) 2022-02-22
CN113837934A (en) 2021-12-24

Similar Documents

Publication Publication Date Title
WO2023093356A1 (en) Image generation method and apparatus, and electronic device and storage medium
US10901740B2 (en) Synthetic depth image generation from cad data using generative adversarial neural networks for enhancement
US11042782B2 (en) Topic-guided model for image captioning system
WO2020063475A1 (en) 6d attitude estimation network training method and apparatus based on deep learning iterative matching
US8923392B2 (en) Methods and apparatus for face fitting and editing applications
JP7225188B2 (en) Method and apparatus for generating video
CN107438866A (en) Depth is three-dimensional:Study predicts new view from real world image
CN111275784A (en) Method and device for generating image
WO2022012179A1 (en) Method and apparatus for generating feature extraction network, and device and computer-readable medium
CN111539897A (en) Method and apparatus for generating image conversion model
US10783660B2 (en) Detecting object pose using autoencoders
WO2023030381A1 (en) Three-dimensional human head reconstruction method and apparatus, and device and medium
WO2023024653A1 (en) Image processing method, image processing apparatus, electronic device and storage medium
US11836836B2 (en) Methods and apparatuses for generating model and generating 3D animation, devices and storage mediums
CN113379877B (en) Face video generation method and device, electronic equipment and storage medium
US10445921B1 (en) Transferring motion between consecutive frames to a digital image
CN110827341A (en) Picture depth estimation method and device and storage medium
US10839249B2 (en) Methods and systems for analyzing images utilizing scene graphs
US20230237713A1 (en) Method, device, and computer program product for generating virtual image
US11138781B1 (en) Creation of photorealistic 3D avatars using deep neural networks
CN115049537A (en) Image processing method, image processing device, electronic equipment and storage medium
JP2014149788A (en) Object area boundary estimation device, object area boundary estimation method, and object area boundary estimation program
CN113223128B (en) Method and apparatus for generating image
US11670029B2 (en) Method and apparatus for processing character image data
US11222200B2 (en) Video-based 3D hand pose and mesh estimation based on temporal-aware self-supervised learning

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 22897439

Country of ref document: EP

Kind code of ref document: A1