CN114792284A - Image switching method and device, storage medium and electronic equipment


Info

Publication number
CN114792284A
Authority
CN
China
Prior art keywords
image
sample
network
switching
target
Prior art date
Legal status
Pending
Application number
CN202210232666.4A
Other languages
Chinese (zh)
Inventor
黄坤山
蔡海军
Current Assignee
Guangzhou Geshen Information Technology Co ltd
Original Assignee
Guangzhou Geshen Information Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Guangzhou Geshen Information Technology Co ltd filed Critical Guangzhou Geshen Information Technology Co ltd
Priority to CN202210232666.4A
Publication of CN114792284A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 3/00 - Geometric image transformations in the plane of the image
    • G06T 3/04 - Context-preserving transformations, e.g. by using an importance map
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 - Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/20 - Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N 21/21 - Server components or server architectures
    • H04N 21/218 - Source of audio or video content, e.g. local disk arrays
    • H04N 21/2187 - Live feed


Abstract

The invention discloses an image switching method and device, a storage medium and electronic equipment. The method comprises the following steps: displaying, in a client, the real image of an anchor object that is currently live broadcasting and an image switching control for controlling switching of the image presented by the anchor object; in response to a trigger operation performed on the image switching control, calling an image switching network deployed in the client, wherein the image switching network is trained with a plurality of sample image pairs, each sample image pair comprises a sample original image and a sample target image, the sample original image displays the real image of a sample object, and the sample target image displays the avatar of the sample object; and synchronously switching the real image of the anchor object to a target virtual image matched with the anchor object through the image switching network. The invention solves the technical problem that image switching cannot be performed on a mobile terminal because the data volume of the image model is large.

Description

Image switching method and device, storage medium and electronic equipment
Technical Field
The invention relates to the field of image processing, in particular to an image switching method and device, a storage medium and electronic equipment.
Background
At present, various types of character images are generally generated by a trained image generation model. In the field of live broadcasting, for example, face-animation special-effect images are generally generated by a trained animation image generation model.
Because no target image is available as a constraint, the image generation model is usually trained in an unsupervised manner, and a model trained in this way has a complex structure and therefore a large data volume. Such a large image generation model can only be deployed on a server. If it is deployed on a mobile terminal without compression, the mobile terminal cannot run it because of the large amount of model data; if it is compressed, the model either cannot run normally or runs slowly.
In view of the above problems, no effective solution has been proposed.
Disclosure of Invention
The embodiment of the invention provides an image switching method and device, a storage medium and electronic equipment, which at least solve the technical problem that image switching cannot be carried out at a mobile terminal due to large image model data volume.
According to an aspect of an embodiment of the present invention, there is provided an image switching method, including: displaying, in a client, the real image of an anchor object that is currently live broadcasting and an image switching control for controlling switching of the image presented by the anchor object; invoking an image switching network deployed in the client in response to a trigger operation performed on the image switching control, wherein the image switching network is trained with a plurality of sample image pairs, each sample image pair comprises a sample original image and a sample target image, the sample original image displays the real image of a sample object, and the sample target image displays the avatar of the sample object; and switching the real image of the anchor object to a target virtual image matched with the anchor object through the image switching network.
According to another aspect of the embodiments of the present invention, there is also provided an image switching apparatus, including: a display unit, configured to display, in a client, the real image of an anchor object that is currently live broadcasting and an image switching control for controlling switching of the image presented by the anchor object; a calling unit, configured to call an image switching network deployed in the client in response to a trigger operation performed on the image switching control, wherein the image switching network is trained with a plurality of sample image pairs, each sample image pair comprises a sample original image and a sample target image, the sample original image displays the real image of a sample object, and the sample target image displays the avatar of the sample object; and a switching unit, configured to switch the real image of the anchor object to a target virtual image matched with the anchor object through the image switching network, wherein the similarity between the target virtual image and the real image of the anchor object is greater than a first threshold.
According to still another aspect of the embodiments of the present invention, there is also provided a computer-readable storage medium in which a computer program is stored, wherein the computer program is configured to execute the above image switching method when run.
According to still another aspect of the embodiments of the present invention, there is also provided an electronic device including a memory and a processor, wherein a computer program is stored in the memory and the processor is configured to execute the above image switching method through the computer program.
In the embodiments of the present invention, the real image of the anchor object that is currently live broadcasting and an image switching control for controlling switching of the image presented by the anchor object are displayed in the client. In response to a trigger operation performed on the image switching control, an image switching network deployed in the client is called; the image switching network is obtained by training with sample image pairs. Through the image switching network, the real image of the anchor object presented in the client is switched to a target virtual image matched with the anchor object. Because the image switching network in the client is called through the image switching control and the conversion from the real image to the target virtual image is performed locally, a lightweight image switching network for this conversion can be deployed on the client. This achieves the technical effect of realizing anchor image switching with a lightweight image switching network deployed in the client, and solves the technical problem that image switching cannot be performed on a mobile terminal because the data volume of the image model is large.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:
fig. 1 is a schematic diagram of an application environment of an alternative image switching method according to an embodiment of the present invention;
FIG. 2 is a flow chart of an alternative image switching method according to an embodiment of the present invention;
FIG. 3 is a flow chart of an alternative image switching method according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the image generator of the image generation network in an alternative image switching method according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of the image discriminator of the image generation network in an alternative image switching method according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of the CAM fully connected processing in the image discriminator of the image generation network in an alternative image switching method according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of the initial image switching network in an alternative image switching method according to an embodiment of the present invention;
FIG. 8 is a schematic flow chart of an alternative image switching method according to an embodiment of the present invention;
FIG. 9 is a schematic diagram of an alternative image switching apparatus according to an embodiment of the present invention;
fig. 10 is a schematic structural diagram of an alternative electronic device according to an embodiment of the invention.
Detailed Description
In order to make those skilled in the art better understand the technical solutions of the present invention, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in other sequences than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
According to an aspect of an embodiment of the present invention, an image switching method is provided, which may be, but is not limited to being, applied in the environment shown in fig. 1. The client 102 runs on the terminal device 100, and the anchor may, but is not limited to, initiate a live broadcast request to the server 120 through the client 102 over the network 110 so as to broadcast live through the client 102. During live broadcasting through the client 102, image switching may, but is not limited to, be performed, and the switched target virtual image is transmitted to the server 120 through the network 110, so that the server 120 streams the switched target virtual image of the anchor object to the viewing clients.
The image switching may, but is not limited to, be implemented by performing S102 to S106 in sequence. S102, display the real image of the anchor object: display, in the client, the real image of the anchor object that is currently live broadcasting and an image switching control for controlling switching of the image presented by the anchor object. S104, call the image switching network: in response to a trigger operation performed on the image switching control, call an image switching network deployed in the client, wherein the image switching network is trained with a plurality of sample image pairs, each sample image pair comprises a sample original image and a sample target image, the sample original image displays the real image of a sample object, and the sample target image displays the avatar of the sample object. S106, switch to the target virtual image: synchronously switch the real image of the anchor object to a target virtual image matched with the anchor object through the image switching network.
Optionally, in this embodiment, the terminal device 100 may be a terminal device configured with a target client, and may include, but is not limited to, at least one of the following: mobile phones (such as Android phones, iOS phones, etc.), notebook computers, tablet computers, palmtop computers, MIDs (Mobile Internet Devices), PADs, desktop computers, smart televisions, etc. The target client may be, but is not limited to, a client having a live broadcast function, such as an audio client, a video client, an instant messaging client, a browser client, an education client, and the like. The network 110 may include, but is not limited to, a wired network and a wireless network, wherein the wired network includes local area networks, metropolitan area networks, and wide area networks, and the wireless network includes Bluetooth, WiFi, and other networks enabling wireless communication. The server 120 may be a single server, a server cluster composed of a plurality of servers, or a cloud server. The above is only an example, and this is not limited in this embodiment.
As an alternative embodiment, as shown in fig. 2, the image switching method includes:
s202, displaying the real image of the anchor object currently live in the client and an image switching control for controlling and switching the image presented by the anchor object;
and S204, responding to the triggering operation executed on the image switching control, and calling the image switching network deployed in the client.
In the above S204, the image switching network is an image switching network obtained by training a plurality of sample image pairs, each sample image pair includes a sample original image and a sample target image, the sample original image displays a real image of the sample object, and the sample target image displays a virtual image of the sample object;
s206, the real image of the anchor object is switched into a target virtual image matched with the anchor object through the image switching network.
The client displays the anchor object that is currently live broadcasting. The image displayed for the anchor object may depend, but is not limited to depending, on whether the anchor performs a trigger operation on the image switching control: when no trigger operation is performed on the image switching control, the real image of the anchor object is displayed; when a trigger operation is performed on the image switching control, the virtual image corresponding to the anchor object is displayed. The virtual image may be, but is not limited to, an image obtained by converting the real image of the anchor object into the target dimension, such as an animation image or a cartoon image.
The real image of the anchor object may be switched to the virtual image by, but not limited to, calling the image switching network through the image switching control. The image switching network converts the real image of the anchor object into the target virtual image, which is then displayed in the client, thereby realizing image switching for the anchor object.
As an optional implementation, synchronously switching the real image of the anchor object to the target virtual image matched with the anchor object through the image switching network includes:
S206-1, inputting a target real image, in which the real image of the anchor object is displayed, into the image switching network, wherein the network data volume of the image switching network is smaller than a target threshold;
S206-2, acquiring a target virtual image output by the image switching network, wherein the output image displays the target virtual image of the anchor object.
The image switching network performs image switching for the anchor object by, but not limited to, inferring an image containing the target virtual image of the anchor object from the image containing the real image of the anchor object; image switching is thus realized through inference-based conversion of the images containing the respective images.
The network data volume of the image switching network is smaller than the target threshold. A Pix2Pix neural network model may be, but is not limited to being, selected: its model data volume is small and its precision is high, so the image switching network can be deployed in the client on the mobile terminal. In response to the trigger operation on the image switching control, the image switching network is called directly to perform virtual-image inference, the target virtual image corresponding to the anchor object is deduced quickly, and image switching is realized.
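As an illustration only (the disclosure names no deep learning framework), the following PyTorch-style sketch shows how a client might invoke such a lightweight Pix2Pix-style generator when the image switching control is triggered; the TorchScript file name, the 256 x 256 input size and the normalization choices are assumptions rather than part of the patent.

```python
import torch
from torchvision import transforms
from PIL import Image

# Hypothetical lightweight generator exported for on-device use; the file name,
# input size and normalization are illustrative assumptions.
generator = torch.jit.load("avatar_switch_generator.pt")
generator.eval()

preprocess = transforms.Compose([
    transforms.Resize((256, 256)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5] * 3, std=[0.5] * 3),   # map pixels to [-1, 1]
])

def switch_image(frame: Image.Image) -> torch.Tensor:
    """Infer the target virtual image from one real-image frame of the anchor object."""
    x = preprocess(frame).unsqueeze(0)      # shape (1, 3, 256, 256)
    with torch.no_grad():
        y = generator(x)                    # generated avatar frame, assumed in [-1, 1]
    return (y.clamp(-1, 1) + 1) / 2         # back to [0, 1] for display or streaming
```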
In the embodiments of the present application, the real image of the anchor object that is currently live broadcasting and an image switching control for controlling switching of the image presented by the anchor object are displayed in the client. In response to a trigger operation performed on the image switching control, the image switching network deployed in the client is called; the image switching network is obtained by training with sample image pairs. Through the image switching network, the real image of the anchor object presented in the client is switched to a target virtual image matched with the anchor object. Because the image switching network in the client is called through the image switching control and the conversion from the real image to the target virtual image is performed locally, a lightweight image switching network for this conversion can be deployed on the client. This achieves the technical effect of realizing anchor image switching with a lightweight image switching network deployed in the client, and solves the technical problem that image switching cannot be performed on a mobile terminal because the data volume of the image model is large.
As an alternative implementation, as shown in fig. 3, before displaying, in the client, the real image of the anchor object that is currently live broadcasting and the switching control for controlling switching of the image presented by the anchor object, the method further includes:
S302, acquiring a sample image set, wherein the sample image set comprises a plurality of sample image pairs;
S304, performing supervised training on an initial image switching network by using the plurality of sample image pairs in the sample image set until an initial image switching network satisfying a conversion convergence condition is obtained, wherein in the supervised training the sample target image is used as a constraint when the initial image switching network is trained to generate a simulation target image from the sample original image;
s306, extracting the image generator in the initial image switching network, and determining the image generator as the image switching network.
A sample original image containing the real image of a sample object and a sample target image containing the avatar of the sample object are used as an image pair to perform supervised training on the initial image switching network. During training, the sample target image is used as a constraint on image generation, so that the initial image switching network performs image inference toward the sample target image. As a result, the trained image switching network has a lighter network structure, and its network data volume is smaller than the target threshold.
The sample image pairs may be, but are not limited to being, acquired through an image generation network: the sample target image corresponding to each sample original image is generated by the image generation network, and after the sample image pairs are obtained, the image switching network is trained on them. As an optional implementation, acquiring the sample image set includes:
s302-1, training an image generation network by using a plurality of original images of the sample to obtain the image generation network meeting a generation convergence condition, wherein the generation convergence condition is used for indicating that the similarity between a sample virtual image in a sample simulation image output by the image generation network and a sample real image in the original image of the sample is greater than a second threshold value;
s302-2, sequentially inputting the original images of the multiple samples into an image generation network to obtain a sample target image output by the image generation network;
s302-3, constructing a sample image pair by using the sample target image and the sample original image.
In order to obtain the sample target image containing the avatar corresponding to the real image in the sample original image, the image generation network may be, but is not limited to being, trained with the sample original images; after the image generation network satisfies the generation convergence condition, it is used to generate the sample target image corresponding to each sample original image. Each sample original image and its corresponding sample target image are constructed into a sample image pair. Sample target images are obtained for the sample original images in turn, yielding a sample image set containing a plurality of sample image pairs, which enables self-supervised training of the image switching network.
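For steps S302-2 and S302-3, a minimal sketch is given below, assuming the trained image generation network is available as a PyTorch module called gan_generator and that each pair is stored as a side-by-side image on disk; the directory layout, image size, output range and function name are illustrative assumptions.

```python
import os
import torch
from PIL import Image
from torchvision import transforms
from torchvision.utils import save_image

to_tensor = transforms.Compose([transforms.Resize((256, 256)), transforms.ToTensor()])

def build_sample_pairs(gan_generator, original_dir, pair_dir, device="cpu"):
    """Run every sample original image through the trained image generation network
    and store each (sample original, sample target) pair for supervised training."""
    gan_generator.eval().to(device)
    os.makedirs(pair_dir, exist_ok=True)
    for name in sorted(os.listdir(original_dir)):
        image = Image.open(os.path.join(original_dir, name)).convert("RGB")
        x = to_tensor(image).unsqueeze(0).to(device)
        with torch.no_grad():
            y = gan_generator(x)            # sample target image (avatar), assumed in [0, 1]
        # Left half: sample original image; right half: generated sample target image.
        save_image(torch.cat([x, y], dim=3), os.path.join(pair_dir, name))
```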
As an optional implementation manner, the training of the image generation network by using a plurality of sample raw images to obtain an image generation network satisfying a generation convergence condition includes:
s302-101, inputting the sample original image into an image generator in an image generation network to obtain a sample simulation image generated by the image generator;
s302-102, inputting the sample simulation image into an image discriminator in an image generation network, and obtaining a discrimination result of the image discriminator for discriminating the truth of the sample simulation image;
s302-103, optimizing the network parameters of the image generation network according to the judgment result until the judgment result meets the judgment convergence parameters indicated by the generation convergence condition.
The image generation network may include, but is not limited to, an image generator and an image discriminator. The image generator is used to generate the image corresponding to the avatar from the input sample original image, obtaining a sample simulation image; the image discriminator is used to judge whether the sample simulation image is real. The parameters of the image discriminator and the image generator are optimized according to the discriminator's authenticity judgment of the sample simulation image, so as to obtain an image generation network satisfying the generation convergence condition.
The image generation network may be, but is not limited to, a generative adversarial network (GAN) model. The structure of the image generator may be, but is not limited to, that shown in fig. 4. The sample original image is input to an encoder 410, which consists of downsampling layers and residual blocks. The input sample original image is downsampled and its feature extraction is enhanced by the residual blocks; key regions are analyzed with the attention feature map 420, which is connected to the decoder 440 through the fully connected layer 430. The decoder 440 combines upsampling with adaptive residual blocks under the guidance of AdaLIN to obtain the sample virtual image.
The structure of the image discriminator may be, but is not limited to, that shown in fig. 5. Its input is the sample virtual image generated by the image generator; after downsampling and CAM processing in the encoder 510, the attention feature map 520 and the fully connected layer 530 finally output a judgment of whether the sample virtual image is a real image. When the output indicates that the sample virtual image is a fake image, that is, a generated image, the image discriminator is judged to be currently stronger than the image generator, and the parameters of the image generator must be further optimized so that the sample virtual images it generates are no longer judged as fake by the image discriminator.
The CAM fully connected processing in the image discriminator may be, but is not limited to, that shown in fig. 6: an average pooling and a max pooling operation are performed on the downsampled feature maps, and each pooled feature is compressed to B x 1 dimensions by a fully connected layer. The weight parameters obtained from the average pooling and max pooling branches are then multiplied element-wise with the original feature map, which passes through one convolution layer, thereby realizing an attention mechanism on the feature map.
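The CAM-style fully connected processing described above can be sketched as follows, assuming a PyTorch implementation in which the fully connected layers compress the pooled features to B x 1 and their weights re-scale the feature map before a 1 x 1 convolution; the module name and the auxiliary cam_logit output are assumptions for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CAMAttention(nn.Module):
    """CAM-style attention: average- and max-pooled features are compressed to
    B x 1 by fully connected layers, whose weights are multiplied back onto the
    feature map before a 1x1 convolution."""

    def __init__(self, channels: int):
        super().__init__()
        self.gap_fc = nn.Linear(channels, 1, bias=False)   # B x 1 from average pooling
        self.gmp_fc = nn.Linear(channels, 1, bias=False)   # B x 1 from max pooling
        self.conv = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, x):
        b, c, _, _ = x.shape
        gap = F.adaptive_avg_pool2d(x, 1).view(b, c)
        gmp = F.adaptive_max_pool2d(x, 1).view(b, c)
        gap_logit = self.gap_fc(gap)                        # B x 1
        gmp_logit = self.gmp_fc(gmp)                        # B x 1
        # Re-weight the original feature map with the fully connected weight vectors.
        x_gap = x * self.gap_fc.weight.view(1, c, 1, 1)
        x_gmp = x * self.gmp_fc.weight.view(1, c, 1, 1)
        x = F.relu(self.conv(torch.cat([x_gap, x_gmp], dim=1)))
        cam_logit = torch.cat([gap_logit, gmp_logit], dim=1)  # auxiliary attention logits
        return x, cam_logit
```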
After the CAM connection, AdaLIN may be, but is not limited to being, applied to the resulting attention feature map. AdaLIN is mainly a combination of Instance Normalization and Layer Normalization. The specific operation may be, but is not limited to, the following formula (1):
AdaLIN(a, γ, β) = γ·(ρ·â_I + (1 − ρ)·â_L) + β, with â_I = (a − μ_I)/√(σ_I² + ε) and â_L = (a − μ_L)/√(σ_L² + ε)    (1)
where a is the input feature map, μ_I and μ_L are the instance-wise and layer-wise means, σ_I² and σ_L² are the corresponding variances, ε is a small fixed value, ρ is a learned mixing ratio whose maximum value is 1, and γ and β are the scale and shift quantities, respectively.
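A minimal sketch of formula (1) follows, assuming a PyTorch module in which ρ is a learnable per-channel parameter clamped to [0, 1] and γ, β are supplied externally (for example by the decoder's fully connected layers); the initial value of ρ and the tensor shapes are assumptions.

```python
import torch
import torch.nn as nn

class AdaLIN(nn.Module):
    """Adaptive Layer-Instance Normalization as in formula (1): mixes
    instance-normalized and layer-normalized features with a learned ratio rho,
    then applies externally supplied scale gamma and shift beta."""

    def __init__(self, channels: int, eps: float = 1e-5):
        super().__init__()
        self.eps = eps
        self.rho = nn.Parameter(torch.full((1, channels, 1, 1), 0.9))

    def forward(self, a, gamma, beta):
        # Instance statistics: per sample and per channel, over H x W.
        mu_i = a.mean(dim=(2, 3), keepdim=True)
        var_i = a.var(dim=(2, 3), keepdim=True, unbiased=False)
        # Layer statistics: per sample, over C x H x W.
        mu_l = a.mean(dim=(1, 2, 3), keepdim=True)
        var_l = a.var(dim=(1, 2, 3), keepdim=True, unbiased=False)
        a_i = (a - mu_i) / torch.sqrt(var_i + self.eps)
        a_l = (a - mu_l) / torch.sqrt(var_l + self.eps)
        rho = self.rho.clamp(0.0, 1.0)                      # rho stays in [0, 1]
        out = rho * a_i + (1.0 - rho) * a_l
        return out * gamma.view(gamma.size(0), -1, 1, 1) + beta.view(beta.size(0), -1, 1, 1)
```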
As an optional implementation, performing supervised training on the initial image switching network by using the plurality of sample image pairs in the sample image set until the initial image switching network satisfying the conversion convergence condition is obtained includes:
S304-1, inputting the sample original image into an initial image generator in the initial image switching network, and acquiring a simulation target image generated by the initial image generator;
S304-2, inputting the sample original image and the simulation target image into an initial image discriminator to obtain a simulation discrimination result indicating whether the initial image discriminator judges the sample original image and the simulation target image to be a real image pair;
S304-3, inputting the sample original image and the sample target image into the initial image discriminator to obtain a sample discrimination result indicating whether the initial image discriminator judges the sample original image and the sample target image to be a real image pair;
s304-4, calculating a network loss value of the initial image switching network according to the simulation judgment result and the sample judgment result;
s304-5, under the condition that the network loss value is smaller than the preset loss value indicated by the conversion convergence condition, determining to obtain the initial image switching network meeting the conversion convergence condition.
The initial image switching network may include, but is not limited to, an initial image generator and an initial image discriminator. The initial image generator generates the simulation target image from the sample original image under the constraint of the sample target image; the initial image discriminator discriminates the simulation target image.
The structure of the initial image switching network may be, but is not limited to, that shown in fig. 7: the generator model G generates the simulation target image G(x) from the sample original image x; the simulation target image G(x) and the sample target image y are input to the discriminator model D, which discriminates them and outputs an authenticity judgment.
The image discriminator in the initial image switching network may, but is not limited to, judge the authenticity of the pair formed by the sample original image and the simulation target image and of the pair formed by the sample original image and the sample target image. The simulation discrimination result indicates the discriminator's judgment on whether the sample original image and the simulation target image form a real image pair; the sample discrimination result indicates its judgment on whether the sample original image and the sample target image form a real image pair.
The training target of the initial image discriminator may be, but is not limited to, the following: when the input is a real image pair, such as the sample original image x and the sample target image y described above, output a probability value as large as possible, the maximum being 1; when the input is not a real image pair, such as the sample original image x and the simulation target image G(x) described above, output a probability value as small as possible, the minimum being 0.
The network loss value may include, but is not limited to, a generation loss value and a discrimination loss value. The generation loss value indicates the loss of the initial image generator when generating the simulation target image; the discrimination loss value indicates the loss of the initial image discriminator when discriminating the simulation target image and the sample target image.
When the network loss value is smaller than the loss value indicated by the conversion convergence condition, the current initial image switching network is determined to have reached the conversion convergence condition, and the initial image generator in the current initial image switching network is determined as the image switching network. When the network loss value is greater than or equal to the loss value indicated by the conversion convergence condition, the parameters of the initial image generator and the initial image discriminator are optimized according to the simulation discrimination result and the sample discrimination result.
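The training of the initial image switching network (steps S304-1 to S304-5) can be sketched as a conditional-GAN training step of the kind shown in fig. 7. The sketch below assumes PyTorch, a discriminator that outputs a probability in (0, 1) for a channel-wise concatenated (condition, image) pair, and binary cross-entropy in place of the raw log terms of formulas (2) to (4); the optimizers, the concatenation scheme and the λ value are illustrative assumptions, not the patented implementation.

```python
import torch
import torch.nn.functional as F

def train_step(generator, discriminator, opt_g, opt_d, x, y, lam=0.5):
    """One supervised training step of the initial image switching network:
    the discriminator scores (x, y) as a real pair and (x, G(x)) as a fake pair,
    after which the generator is optimized against the discriminator and an L1 term.
    The discriminator is assumed to end in a sigmoid, i.e. to output probabilities."""
    # --- initial image discriminator update ---
    with torch.no_grad():
        g_x = generator(x)                                   # simulation target image G(x)
    d_real = discriminator(torch.cat([x, y], dim=1))         # sample discrimination result
    d_fake = discriminator(torch.cat([x, g_x], dim=1))       # simulation discrimination result
    loss_d = F.binary_cross_entropy(d_real, torch.ones_like(d_real)) \
           + F.binary_cross_entropy(d_fake, torch.zeros_like(d_fake))
    opt_d.zero_grad()
    loss_d.backward()
    opt_d.step()

    # --- initial image generator update ---
    g_x = generator(x)
    d_fake = discriminator(torch.cat([x, g_x], dim=1))
    loss_adv = F.binary_cross_entropy(d_fake, torch.ones_like(d_fake))
    loss_l1 = F.l1_loss(g_x, y)                              # image distance to the sample target image
    loss_g = loss_adv + lam * loss_l1                        # weighted network loss, cf. formula (4)
    opt_g.zero_grad()
    loss_g.backward()
    opt_g.step()
    return loss_d.item(), loss_g.item()
```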
As an optional implementation, the calculating a network loss value of the initial image switching network according to the simulation judgment result and the sample judgment result includes:
s304-401, calculating a discrimination loss value according to the simulation discrimination result and the sample discrimination result;
s304-402, calculating the image distance between the sample target image and the simulation target image to obtain a generation loss value;
s304-403, carrying out weighted calculation on the discrimination loss value and the generation loss value to obtain a network loss value;
specifically, the calculation of the discrimination loss value is not limited to the following formula (2):
L D =E y [logD(x,y)]+E G [log(1-D(x,G(x)))] (2)
wherein E is y Is the expectation of the sample discrimination result, E G Is expected for simulating the discrimination result.
Specifically, the calculation of the generation loss value is not limited to the following formula (3):
L C =E C [||y-G(x)||] (3)
wherein | y-G (x) | is the image distance between the sample target image and the simulated target image, E C Is the image distance expectation.
Specifically, the calculation of the network loss value is not limited to the following formula (4):
G=L D +λL C (4)
wherein, the lambda is a scale factor and takes any value from 0 to 1.
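A direct transcription of formulas (2) to (4) into code might look as follows. This is a sketch that assumes d_real = D(x, y) and d_fake = D(x, G(x)) are probabilities in (0, 1) and that the image distance of formula (3) is the L1 distance; in practice the discriminator is trained to maximize L_D while the generator minimizes the combined objective.

```python
import torch

def discrimination_loss(d_real, d_fake, eps=1e-8):
    # Formula (2): L_D = E_y[log D(x, y)] + E_G[log(1 - D(x, G(x)))]
    return torch.log(d_real + eps).mean() + torch.log(1.0 - d_fake + eps).mean()

def generation_loss(y, g_x):
    # Formula (3): L_C = E[ ||y - G(x)|| ], here taken as the mean L1 distance
    # between the sample target image y and the simulation target image G(x).
    return (y - g_x).abs().mean()

def network_loss(d_real, d_fake, y, g_x, lam=0.5):
    # Formula (4): G = L_D + lambda * L_C, with lambda a scale factor in (0, 1].
    return discrimination_loss(d_real, d_fake) + lam * generation_loss(y, g_x)
```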
Taking the virtual image as an animation image and the image containing the image as a face image as an example, the process of obtaining the animation image switching network may be, but is not limited to, that shown in fig. 8. A1: train the face animation data generation module with real face images. A2: after the face animation data generation module finishes training, input the real face images into it to obtain the corresponding animation face images. A3: construct image data pairs from the real face images and the animation face images. A4: train the face animation inference module with the image data pairs constructed from the real face images and the animation face images. A5: after the face animation inference module finishes training, determine the animation generation module in the face animation inference module as the animation image switching network.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the invention. Further, those skilled in the art will appreciate that the embodiments described in this specification are presently preferred and that no acts or modules are required by the invention.
According to another aspect of an embodiment of the present invention, there is also provided an image switching apparatus for implementing the above image switching method. As shown in fig. 9, the apparatus includes:
a display unit 902, configured to display, in the client, the real image of an anchor object that is currently live broadcasting and an image switching control for controlling switching of the image presented by the anchor object;
a calling unit 904, configured to call, in response to a trigger operation performed on the image switching control, an image switching network deployed in the client, wherein the image switching network is a lightweight neural network for converting the real image presented in the client into a virtual image;
a switching unit 906, configured to synchronously switch the real image of the anchor object to a target virtual image matched with the anchor object through the image switching network, wherein the similarity between the target virtual image and the real image of the anchor object is greater than a first threshold.
Optionally, the switching unit 906 is further configured to input a target real image, on which a target real image of the anchor object is displayed, into the image switching network, wherein a network data amount of the image switching network is smaller than a target threshold; and acquiring a target virtual image output by the image switching network, wherein the target virtual image displays a target virtual image of the anchor object.
Optionally, the image switching apparatus further includes a first training unit that operates before the real image of the anchor object that is currently live broadcasting and the switching control for controlling switching of the image presented by the anchor object are displayed in the client. The first training unit includes:
an acquisition module, configured to acquire a sample image set, wherein the sample image set includes a plurality of sample image pairs, each sample image pair includes a sample original image and a sample target image, the sample original image displays the real image of a sample object, and the sample target image displays the avatar of the sample object;
a first training module, configured to perform supervised training on the initial image switching network by using the plurality of sample image pairs in the sample image set until the initial image switching network satisfying the conversion convergence condition is obtained, wherein in the supervised training the sample target image is used as a constraint when the initial image switching network is trained to generate a simulation target image from the sample original image;
an extraction module, configured to extract the image generator in the initial image switching network and determine the image generator as the image switching network.
Optionally, the obtaining module is further configured to train the image generation network by using a plurality of sample original images to obtain an image generation network meeting a generation convergence condition, where the generation convergence condition is used to indicate that a similarity between a sample virtual image in a sample simulation image output by the image generation network and a sample real image in the sample original image is greater than a second threshold; sequentially inputting a plurality of sample original images into an image generation network to obtain sample target images output by the image generation network; and constructing a sample image pair by using the sample target image and the sample original image.
Optionally, the obtaining module is further configured to input the original sample image into an image generator in the image generation network, so as to obtain a sample simulation image generated by the image generator; inputting the sample simulation image into an image discriminator in an image generation network, and acquiring a discrimination result of the image discriminator for discriminating the truth of the sample simulation image; and optimizing the network parameters of the image generation network according to the judgment result until the judgment result meets the judgment convergence parameter indicated by the generated convergence condition.
Optionally, the first training module includes:
the input module is used for inputting the original sample image into an initial image generator in an initial image switching network and acquiring a simulated target image generated by the initial image generator;
the simulation judging module is used for inputting the sample original image and the simulation target image into the initial image discriminator to obtain a simulation discrimination result indicating whether the initial image discriminator judges the sample original image and the simulation target image to be a real image pair;
the sample distinguishing module is used for inputting the sample original image and the sample target image into the initial image discriminator to obtain a sample discrimination result indicating whether the initial image discriminator judges the sample original image and the sample target image to be a real image pair;
the calculation module is used for calculating a network loss value of the initial image switching network according to the simulation judgment result and the sample judgment result;
and the determining module is used for determining to obtain the initial image switching network meeting the conversion convergence condition under the condition that the network loss value is less than the preset loss value indicated by the conversion convergence condition.
Optionally, the calculating module is further configured to calculate a discrimination loss value according to the simulation discrimination result and the sample discrimination result; calculating the image distance between the sample target image and the simulated target image to obtain a generation loss value; carrying out weighted calculation on the discrimination loss value and the generation loss value to obtain a network loss value;
in the embodiment of the application, the real image of the anchor object which is currently live broadcast is displayed in the client, the image switching control for controlling and switching the image presented by the anchor object is adopted, the image switching network deployed in the client is called in response to the triggering operation executed on the image switching control, the image switching network is an image switching network obtained by training by utilizing a sample image, the real image of the anchor object presented in the client is switched into a target virtual image matched with the anchor object through the image switching network, the image switching network in the client is called through the image switching control in the client, and then the real image of the anchor object is converted into the target virtual image through the image switching network, so that the aim of deploying the light-weight image switching network in the client for converting the real image of the anchor object into the target virtual image is fulfilled, therefore, the technical effect that a lightweight image switching network is deployed in the client to realize anchor image switching is achieved, and the technical problem that image switching cannot be carried out at the mobile terminal due to the fact that image model data size is large is solved.
According to still another aspect of an embodiment of the present invention, there is also provided an electronic device for implementing the above-described character switching method, which may be a terminal device or a server shown in fig. 1. The present embodiment takes the electronic device as a terminal device as an example for explanation. As shown in fig. 10, the electronic device comprises a memory 1002 and a processor 1004, the memory 1002 having stored therein a computer program, the processor 1004 being arranged to execute the steps of any of the method embodiments described above by means of the computer program.
Optionally, in this embodiment, the electronic device may be located in at least one network device of a plurality of network devices of a computer network.
Optionally, in this embodiment, the processor may be configured to execute the following steps by a computer program:
s1, displaying the real image of the anchor object currently live broadcasting and an image switching control for controlling and switching the image presented by the anchor object in the client;
s2, responding to the trigger operation executed on the image switching control, and calling an image switching network deployed in the client, wherein the image switching network is an image switching network obtained by training a plurality of sample image pairs, each sample image pair comprises a sample original image and a sample target image, the sample original image displays the real image of a sample object, and the sample target image displays the virtual image of the sample object;
and S3, switching the real image of the anchor object into a target virtual image matched with the anchor object through the image switching network.
Alternatively, it may be understood by those skilled in the art that the structure shown in fig. 10 is only an illustration, and the electronic device may also be a terminal device such as a smart phone (e.g., an Android phone, an IOS phone, etc.), a tablet computer, a palmtop computer, and a Mobile Internet Device (MID), a PAD, and the like. Fig. 10 is a diagram illustrating a structure of the electronic device. For example, the electronic device may also include more or fewer components (e.g., network interfaces, etc.) than shown in FIG. 10, or have a different configuration than shown in FIG. 10.
The memory 1002 may be used to store software programs and modules, such as the program instructions/modules corresponding to the image switching method and apparatus in the embodiments of the present invention. The processor 1004 executes various functional applications and data processing, that is, implements the image switching method, by running the software programs and modules stored in the memory 1002. The memory 1002 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 1002 may further include memory located remotely from the processor 1004, which may be connected to the terminal over a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof. The memory 1002 may be, but is not limited to being, specifically configured to store information such as the real image, the virtual image, and the image switching network. As an example, as shown in fig. 10, the memory 1002 may include, but is not limited to, the display unit 902, the calling unit 904, and the switching unit 906 of the image switching apparatus. It may also include, but is not limited to, other module units in the image switching apparatus, which are not described in detail in this example.
Optionally, the transmission device 1006 is used for receiving or sending data via a network. Examples of the network may include wired networks and wireless networks. In one example, the transmission device 1006 includes a network adapter (NIC), which can be connected to a router and other network devices via a network cable so as to communicate with the Internet or a local area network. In one example, the transmission device 1006 is a radio frequency (RF) module, which is used to communicate with the Internet in a wireless manner.
In addition, the electronic device further includes: a display 1008 for displaying the real image and the virtual image; and a connection bus 1010 for connecting the respective module parts in the above-described electronic apparatus.
In other embodiments, the terminal device or the server may be a node in a distributed system, where the distributed system may be a blockchain system, and the blockchain system may be a distributed system formed by connecting a plurality of nodes through a network communication. Nodes can form a Peer-To-Peer (P2P, Peer To Peer) network, and any type of computing device, such as a server, a terminal, and other electronic devices, can become a node in the blockchain system by joining the Peer-To-Peer network.
According to an aspect of the application, there is provided a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions to cause the computer device to perform the methods provided in the various alternative implementations of the avatar switching aspect described above. Wherein the computer program is arranged to perform the steps of any of the above method embodiments when executed.
Alternatively, in the present embodiment, the above-mentioned computer-readable storage medium may be configured to store a computer program for executing the steps of:
s1, displaying the real image of the anchor object currently live broadcasting and an image switching control for controlling and switching the image presented by the anchor object in the client;
s2, responding to a trigger operation executed on the image switching control, and calling an image switching network deployed in a client, wherein the image switching network is an image switching network obtained by training a plurality of sample image pairs, each sample image pair comprises a sample original image and a sample target image, the sample original image displays a real image of a sample object, and the sample target image displays an avatar of the sample object;
and S3, synchronously switching the target real image of the anchor object into a target virtual image matched with the anchor object through the image switching network.
Alternatively, in this embodiment, a person skilled in the art may understand that all or part of the steps in the methods of the foregoing embodiments may be implemented by a program instructing hardware associated with the terminal device, where the program may be stored in a computer-readable storage medium, and the storage medium may include: flash disks, Read-Only memories (ROMs), Random Access Memories (RAMs), magnetic or optical disks, and the like.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
The integrated unit in the above embodiments, if implemented in the form of a software functional unit and sold or used as an independent product, may be stored in the above computer-readable storage medium. Based on such understanding, the technical solution of the present invention, in essence or in the part that contributes over the prior art, or all or part of the technical solution, may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing one or more computer devices (which may be personal computers, servers, network devices, or the like) to execute all or part of the steps of the methods described in the embodiments of the present invention.
In the above embodiments of the present invention, the description of each embodiment has its own emphasis, and reference may be made to the related description of other embodiments for parts that are not described in detail in a certain embodiment.
In the several embodiments provided in the present application, it should be understood that the disclosed client may be implemented in other ways. The above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one type of logical functional division, and other divisions may be implemented in practice, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The foregoing is only a preferred embodiment of the present invention, and it should be noted that, for those skilled in the art, various modifications and amendments can be made without departing from the principle of the present invention, and these modifications and amendments should also be considered as the protection scope of the present invention.

Claims (10)

1. An image switching method, comprising:
displaying, in a client, the real image of an anchor object that is currently live broadcasting and an image switching control for controlling switching of the image presented by the anchor object;
invoking an image switching network deployed in the client in response to a triggering operation performed on the image switching control, wherein the image switching network is an image switching network trained by using a plurality of sample image pairs, each sample image pair includes a sample original image and a sample target image, the sample original image displays a real image of a sample object, and the sample target image displays an avatar of the sample object;
and switching the real image of the anchor object into a target virtual image matched with the anchor object through the image switching network.
2. The method of claim 1, wherein said switching the real image of said anchor object to a target virtual image matched with said anchor object through said image switching network comprises:
inputting a target real image on which a target real image of the anchor object is displayed into the image switching network, wherein a network data amount of the image switching network is smaller than a target threshold;
and acquiring a target virtual image output by the image switching network, wherein the target virtual image displays the target virtual image of the anchor object.
3. The method of claim 1, wherein before displaying, in the client, the real image of the anchor object that is currently live broadcasting and the switching control for controlling switching of the image presented by the anchor object, the method further comprises:
obtaining a sample image set, wherein the sample image set comprises the plurality of sample image pairs;
performing supervised training on an initial image switching network by using the plurality of sample image pairs in the sample image set until the initial image switching network satisfying a conversion convergence condition is obtained, wherein in the supervised training the sample target image is used as a constraint when the initial image switching network is trained to generate a simulation target image from the sample original image;
and extracting an image generator in the initial image switching network, and determining the image generator as the image switching network.
4. The method of claim 3, wherein the obtaining a sample image set comprises:
training an image generation network by using a plurality of sample original images to obtain the image generation network meeting a generation convergence condition, wherein the generation convergence condition is used for indicating that the similarity between a sample virtual image in a sample simulation image output by the image generation network and a sample real image in the sample original images is greater than a second threshold value;
sequentially inputting the plurality of sample original images into the image generation network to obtain the sample target images output by the image generation network;
constructing the sample image pair using the sample target image and the sample original image.
5. The method of claim 4, wherein the training an image generation network by using a plurality of sample original images to obtain the image generation network meeting a generation convergence condition comprises:
inputting the sample original image into an image generator in the image generation network to obtain the sample simulation image generated by the image generator;
inputting the sample simulation image into an image discriminator in the image generation network, and obtaining a discrimination result of the image discriminator for discriminating the authenticity of the sample simulation image;
and optimizing the network parameters of the image generation network according to the discrimination result until the discrimination result meets the discrimination convergence parameter indicated by the generation convergence condition.
6. The method of claim 3, wherein the performing supervised training on an initial image switching network by using the plurality of sample image pairs in the sample image set until the initial image switching network meeting the conversion convergence condition is obtained comprises:
inputting the sample original image into an initial image generator in the initial image switching network, and acquiring a simulation target image generated by the initial image generator;
inputting the sample original image and the simulation target image into an initial image discriminator to obtain a simulation discrimination result of the initial image discriminator, wherein the sample original image and the simulation target image form a simulation image pair;
inputting the sample original image and the sample target image into the initial image discriminator to obtain a sample discrimination result of the initial image discriminator, wherein the sample original image and the sample target image are a real image pair;
calculating a network loss value of the initial image switching network according to the simulation discrimination result and the sample discrimination result;
and under the condition that the network loss value is smaller than a preset loss value indicated by the conversion convergence condition, determining that the initial image switching network meeting the conversion convergence condition has been obtained.
7. The method of claim 6, wherein the calculating a network loss value of the initial image switching network according to the simulation discrimination result and the sample discrimination result comprises:
calculating a discrimination loss value according to the simulation discrimination result and the sample discrimination result;
calculating an image distance between the sample target image and the simulation target image to obtain a generation loss value;
and performing weighted calculation on the discrimination loss value and the generation loss value to obtain the network loss value.
8. An image switching device, comprising:
a display unit, configured to display, in a client, a real image of an anchor object which is currently live, and an image switching control for controlling switching of the image presented by the anchor object;
a calling unit, configured to invoke an image switching network deployed in the client in response to a triggering operation performed on the image switching control, wherein the image switching network is a lightweight neural network used for converting a real image presented in the client into an avatar;
and a switching unit, configured to synchronously switch the target real image of the anchor object into a target virtual image matched with the anchor object through the image switching network, wherein a similarity between the target real image and the target virtual image is greater than a first threshold value.
9. A computer-readable storage medium, comprising a stored program, wherein the program when executed performs the method of any one of claims 1 to 7.
10. An electronic device comprising a memory and a processor, characterized in that the memory has stored therein a computer program, the processor being arranged to execute the method of any of claims 1 to 7 by means of the computer program.
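
As an illustration of the inference path in claims 2 and 8, the following minimal sketch (in PyTorch) loads a small, exported generator on the client and converts one live frame into the matching avatar frame. The model file name, input resolution and normalization are assumptions for illustration, not details given by the patent.

import torch
import torchvision.transforms.functional as TF
from PIL import Image

def switch_avatar(frame: Image.Image, model_path: str = "switch_net.pt") -> Image.Image:
    # Load the exported (TorchScript) generator; its on-disk size is what the
    # claims compare against the "target threshold" for client deployment.
    net = torch.jit.load(model_path, map_location="cpu").eval()

    # Assumed preprocessing: resize and map the frame to the [-1, 1] range
    # commonly used by image-to-image generators.
    x = TF.to_tensor(frame.convert("RGB").resize((256, 256))) * 2.0 - 1.0

    with torch.no_grad():
        y = net(x.unsqueeze(0))[0]           # output image showing the target virtual image

    y = (y.clamp(-1.0, 1.0) + 1.0) / 2.0     # back to [0, 1]
    return TF.to_pil_image(y)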
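
Claims 4 and 5 describe how the training pairs are built: an image generation network (a generator plus a discriminator) is first trained until its output avatars look sufficiently real, and the converged generator is then run over every sample original image to produce the paired sample target images. The sketch below shows one such training step and the pair construction; the network objects, the optimizers and the batch of reference avatar images are assumptions.

import torch
import torch.nn.functional as F

def generation_step(gen, disc, g_opt, d_opt, originals, reference_avatars):
    # Discriminator update: reference avatar images should score "real",
    # generated sample simulation images should score "fake" (claim 5).
    fake = gen(originals).detach()
    d_real = disc(reference_avatars)
    d_fake = disc(fake)
    d_loss = (F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real))
              + F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake)))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # Generator update: its simulation images should fool the discriminator.
    score = disc(gen(originals))
    g_loss = F.binary_cross_entropy_with_logits(score, torch.ones_like(score))
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
    return d_loss.item(), g_loss.item()

@torch.no_grad()
def build_sample_pairs(gen, originals):
    # Claim 4: feed every sample original image through the converged generator
    # and keep (sample original image, sample target image) as one pair.
    return [(img, gen(img.unsqueeze(0))[0]) for img in originals]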
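
Claims 3, 6 and 7 describe supervised training of the initial image switching network in the style of a paired conditional GAN: the discriminator scores the (sample original image, simulation target image) pair against the (sample original image, sample target image) pair, and the generator loss combines, by weighting, a discrimination loss with the image distance between the simulation target image and the sample target image. In the sketch below, the weight values, the use of an L1 image distance and the channel-wise concatenation of each pair are assumptions; only the generator is kept and exported afterwards, as in claim 3.

import torch
import torch.nn.functional as F

LAMBDA_ADV, LAMBDA_IMG = 1.0, 100.0   # assumed weights for the weighted sum in claim 7

def switching_step(gen, disc, g_opt, d_opt, original, sample_target):
    sim_target = gen(original)

    # Claim 6: discriminate the simulation pair and the sample (ground-truth) pair.
    d_sim = disc(torch.cat([original, sim_target.detach()], dim=1))
    d_real = disc(torch.cat([original, sample_target], dim=1))
    d_loss = (F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real))
              + F.binary_cross_entropy_with_logits(d_sim, torch.zeros_like(d_sim)))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # Claim 7: discrimination loss plus image distance, combined by weighting.
    adv = disc(torch.cat([original, sim_target], dim=1))
    discrimination_loss = F.binary_cross_entropy_with_logits(adv, torch.ones_like(adv))
    generation_loss = F.l1_loss(sim_target, sample_target)
    network_loss = LAMBDA_ADV * discrimination_loss + LAMBDA_IMG * generation_loss
    g_opt.zero_grad(); network_loss.backward(); g_opt.step()
    return network_loss.item()

def export_switching_network(gen, example_input, path="switch_net.pt"):
    # Claim 3: once the network loss falls below the preset value, only the
    # image generator is extracted and saved as the client-side switching network.
    torch.jit.trace(gen.eval(), example_input).save(path)
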
CN202210232666.4A 2022-03-09 2022-03-09 Image switching method and device, storage medium and electronic equipment Pending CN114792284A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210232666.4A CN114792284A (en) 2022-03-09 2022-03-09 Image switching method and device, storage medium and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210232666.4A CN114792284A (en) 2022-03-09 2022-03-09 Image switching method and device, storage medium and electronic equipment

Publications (1)

Publication Number Publication Date
CN114792284A true CN114792284A (en) 2022-07-26

Family

ID=82460667

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210232666.4A Pending CN114792284A (en) 2022-03-09 2022-03-09 Image switching method and device, storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN114792284A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024032137A1 (en) * 2022-08-12 2024-02-15 腾讯科技(深圳)有限公司 Data processing method and apparatus for virtual scene, electronic device, computer-readable storage medium, and computer program product

Similar Documents

Publication Publication Date Title
CN111479112B (en) Video coding method, device, equipment and storage medium
CN109472764B (en) Method, apparatus, device and medium for image synthesis and image synthesis model training
CN111815534A (en) Real-time skin makeup migration method, device, electronic device and readable storage medium
CN112232325B (en) Sample data processing method and device, storage medium and electronic equipment
CN112906721B (en) Image processing method, device, equipment and computer readable storage medium
CN112001274A (en) Crowd density determination method, device, storage medium and processor
CN115222862A (en) Virtual human clothing generation method, device, equipment, medium and program product
CN111182332B (en) Video processing method, device, server and storage medium
CN114792284A (en) Image switching method and device, storage medium and electronic equipment
CN115527090A (en) Model training method, device, server and storage medium
CN112637609B (en) Image real-time transmission method, sending end and receiving end
CN115238806A (en) Sample class imbalance federal learning method and related equipment
CN111553961B (en) Method and device for acquiring line manuscript corresponding color map, storage medium and electronic device
CN111488476B (en) Image pushing method, model training method and corresponding devices
CN110232393B (en) Data processing method and device, storage medium and electronic device
CN115861605A (en) Image data processing method, computer equipment and readable storage medium
CN114501015B (en) Video coding rate processing method and device, storage medium and electronic equipment
CN116798052B (en) Training method and device of text recognition model, storage medium and electronic equipment
CN116912639B (en) Training method and device of image generation model, storage medium and electronic equipment
CN113810736B (en) AI-driven real-time point cloud video transmission method and system
CN117670652A (en) Image processing method, device, electronic equipment and storage medium
CN114756425A (en) Intelligent monitoring method and device, electronic equipment and computer readable storage medium
CN114693551A (en) Image processing method, device, equipment and readable storage medium
CN117788833A (en) Image recognition method and device, storage medium and electronic equipment
CN116010697A (en) Data processing method, electronic device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination