CN113362263A - Method, apparatus, medium, and program product for changing the image of a virtual idol


Info

Publication number
CN113362263A
CN113362263A (application CN202110585489.3A; granted as CN113362263B)
Authority
CN
China
Prior art keywords
attribute information
sample
virtual idol
standard object
fusion
Prior art date
Legal status
Granted
Application number
CN202110585489.3A
Other languages
Chinese (zh)
Other versions
CN113362263B (en)
Inventor
吴准
张晓东
李士岩
Current Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202110585489.3A
Publication of CN113362263A
Application granted
Publication of CN113362263B
Legal status: Active


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 5/00 Image enhancement or restoration
    • G06T 5/50 Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20212 Image combination
    • G06T 2207/20221 Image fusion; Image merging

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Graphics (AREA)
  • Geometry (AREA)
  • Software Systems (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The present disclosure provides a method, apparatus, medium, and program product for transforming the image of a virtual idol, relating to artificial intelligence fields such as deep learning and computer vision. One embodiment of the method comprises: acquiring attribute information of the virtual idol and attribute information of a standard object; determining a target image transformation mode according to the attribute information of the virtual idol and/or the attribute information of the standard object; and fusing the attribute information of the virtual idol and the attribute information of the standard object based on a fusion model corresponding to the target image transformation mode to obtain a fusion result.

Description

Method, apparatus, medium, and program product for changing the image of a virtual idol
Technical Field
The embodiments of the present disclosure relate to the field of computers, in particular to artificial intelligence fields such as deep learning and computer vision, and specifically to a method, device, medium, and program product for transforming the image of a virtual idol.
Background
At present, fusion technology is widely applied in scenarios such as virtual avatars, novelty effects in long- and short-form videos, and photo-album effects. Fusion techniques typically need to preserve the attribute information of one character while fusing in the attribute information of another character.
Disclosure of Invention
The disclosed embodiments provide a method, apparatus, medium, and program product for transforming the image of a virtual idol.
In a first aspect, an embodiment of the present disclosure provides a method for transforming the image of a virtual idol, including: acquiring attribute information of the virtual idol and attribute information of a standard object; determining a target image transformation mode according to the attribute information of the virtual idol and/or the attribute information of the standard object; and fusing the attribute information of the virtual idol and the attribute information of the standard object based on a fusion model corresponding to the target image transformation mode to obtain a fusion result.
In a second aspect, an embodiment of the present disclosure provides an apparatus for transforming the image of a virtual idol, including: an information acquisition unit configured to acquire attribute information of the virtual idol and attribute information of the standard object; a mode determination unit configured to determine a target image transformation mode according to the attribute information of the virtual idol and/or the attribute information of the standard object; and an information fusion unit configured to fuse the attribute information of the virtual idol and the attribute information of the standard object based on a fusion model corresponding to the target image transformation mode to obtain a fusion result.
In a third aspect, an embodiment of the present disclosure provides an electronic device, including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method as described in the first aspect.
In a fourth aspect, the disclosed embodiments propose a non-transitory computer readable storage medium having stored thereon computer instructions for causing a computer to perform the method as described in the first aspect.
In a fifth aspect, the disclosed embodiments propose a computer program product comprising a computer program that, when executed by a processor, implements the method as described in the first aspect.
The method, the device, the medium and the program product for transforming the image of the virtual idol provided by the embodiment of the disclosure firstly acquire the attribute information of the virtual idol and the attribute information of a standard object; then, determining a target image transformation mode according to the attribute information of the virtual idol and/or the attribute information of the standard object; and finally, fusing the attribute information of the virtual idol and the attribute information of the standard object based on the fusion model corresponding to the target image transformation mode to obtain a fusion result. The fusion model corresponding to the target image transformation mode can be determined by the attribute information of the virtual idol and/or the attribute information of the standard object, and the attribute information of the virtual idol and the attribute information of the standard object are fused to obtain a fusion result, so that the image transformation of the virtual idol is realized.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
Other features, objects, and advantages of the disclosure will become apparent from a reading of the following detailed description of non-limiting embodiments which proceeds with reference to the accompanying drawings. The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is an exemplary system architecture diagram in which the present disclosure may be applied;
FIG. 2 is a flow diagram of one embodiment of a method of transforming the avatar of a virtual idol according to the present disclosure;
FIG. 3 is a flow diagram of another embodiment of a method of transforming the avatar of a virtual idol according to the present disclosure;
FIG. 4 is a flow diagram of one embodiment of a method of generating a fusion model according to the present disclosure;
FIG. 5 is a schematic diagram of one application scenario of a method of transforming the avatar of a virtual idol according to the present disclosure;
FIG. 6 is a block diagram of one embodiment of an apparatus for transforming the avatar of a virtual idol according to the present disclosure;
FIG. 7 is a block diagram of an electronic device used to implement an embodiment of the disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
It should be noted that, in the present disclosure, the embodiments and features of the embodiments may be combined with each other without conflict. The present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
Fig. 1 illustrates an exemplary system architecture 100 to which embodiments of the method or apparatus for transforming the avatar of a virtual idol of the present disclosure may be applied.
As shown in fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to obtain attribute information of the virtual idol, attribute information of the standard object, and the like. The terminal devices 101, 102, 103 may have installed thereon various client applications, intelligent interactive applications, such as video-related software, live-related software, image processing applications, and so on.
The terminal apparatuses 101, 102, and 103 may be hardware or software. When the terminal devices 101, 102, and 103 are hardware, they may be electronic products that perform human-computer interaction with a user through one or more of a keyboard, a touch pad, a display screen, a touch screen, a remote controller, voice interaction, or handwriting devices, such as a PC (Personal Computer), a mobile phone, a smartphone, a PDA (Personal Digital Assistant), a wearable device, a PPC (Pocket PC), a tablet computer, a smart in-car device, a smart television, a smart speaker, a laptop computer, a desktop computer, and the like. When the terminal apparatuses 101, 102, 103 are software, they can be installed in the above-described electronic apparatuses. They may be implemented as multiple pieces of software or software modules, or as a single piece of software or software module. No specific limitation is imposed herein.
The server 105 may provide various services. For example, the server 105 may acquire attribute information of virtual idols on the terminal devices 101, 102, 103, and attribute information of standard objects; determining a target image transformation mode according to the attribute information of the virtual idol and/or the attribute information of the standard object; and fusing the attribute information of the virtual idol and the attribute information of the standard object based on the fusion model corresponding to the target image transformation mode to obtain a fusion result.
The server 105 may be hardware or software. When the server 105 is hardware, it may be implemented as a distributed server cluster composed of a plurality of servers, or may be implemented as a single server. When the server 105 is software, it may be implemented as multiple pieces of software or software modules (e.g., to provide distributed services), or as a single piece of software or software module. And is not particularly limited herein.
It should be noted that the method for transforming the avatar of the virtual idol provided by the embodiment of the present disclosure is generally executed by the server 105, and accordingly, the device for transforming the avatar of the virtual idol is generally disposed in the server 105.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
With continued reference to FIG. 2, a flow 200 of one embodiment of a method of transforming the avatar of a virtual idol in accordance with the present disclosure is shown. The method of transforming the avatar of a virtual idol may include the steps of:
step 201, obtaining the attribute information of the virtual idol and the attribute information of the standard object.
In the present embodiment, the execution body of the method of transforming the avatar of a virtual idol (e.g., the terminal apparatus 101, 102, 103 or the server 105 shown in fig. 1) may acquire the attribute information of the virtual idol and the attribute information of the standard object. The virtual idol can be produced in forms such as drawing and animation, and can perform in virtual scenes such as the Internet or in real scenes. The standard object may be an object in a template, and the virtual idol may be transformed based on the standard object.
In this embodiment, the virtual idol may be an avatar corresponding to an entity object (e.g., a live anchor) in a live scene.
It should be noted that the image of the virtual idol is not limited to one form. Virtual idols may have different personalities. The avatar of a virtual idol is generally a 3D avatar. Virtual idols may have different appearances and decorations. Each virtual idol's image can also correspond to a plurality of different outfits, which can be categorized by season and scene.
Furthermore, the virtual idol image data of each round of interaction records the clothing, makeup, ornaments, accessories, hairstyle, body movements, and expressions of the virtual idol during that round, while the virtual idol voice data of each round records the dialogue text, speech speed, and tone. From these data, a 300-dimensional word vector is generated for the dialogue text; 0/1-coded vectors are generated for features such as the virtual idol's clothing, makeup, ornaments, accessories, hairstyle, voice, and tone; a 38-point skeleton key-point vector is generated for the body movements; and a 29-point expression key-point vector is generated for the expressions. The four vectors are concatenated in sequence to produce a high-dimensional vector that serves as the virtual idol information for that round of interaction. A sketch of this concatenation is given below.
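As one illustration of the concatenation step, the following sketch builds the per-round vector with NumPy; the helper name, the width of the 0/1 feature vector, and the 2D key-point dimensionality are assumptions, since the text only fixes the word-vector size (300) and the key-point counts (38 and 29).

```python
import numpy as np

def build_interaction_vector(text_vec, appearance_flags, skeleton_pts, expression_pts):
    """Concatenate the four per-round feature vectors in the order given above.

    Assumed shapes (only the 300-dim word vector and the 38/29 key-point
    counts come from the text; the rest are illustrative):
      text_vec        : (300,)   word vector for the dialogue text
      appearance_flags: (k,)     0/1-coded clothing/makeup/ornament/voice features
      skeleton_pts    : (38, 2)  skeleton key points for body movements
      expression_pts  : (29, 2)  expression key points
    """
    return np.concatenate([
        np.asarray(text_vec, dtype=np.float32).ravel(),
        np.asarray(appearance_flags, dtype=np.float32).ravel(),
        np.asarray(skeleton_pts, dtype=np.float32).ravel(),
        np.asarray(expression_pts, dtype=np.float32).ravel(),
    ])
```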
In the technical solution of the present disclosure, the acquisition, storage, and application of the attribute information involved all comply with the provisions of relevant laws and regulations, and do not violate public order or good morals.
In this embodiment, before obtaining the attribute information of the virtual idol and the attribute information of the standard object, the method for transforming the avatar of the virtual idol may further include: (1) the entity object (i.e., the object corresponding to the virtual idol) inputs a specific voice instruction through a microphone, such as "I want to play star face" or "I want to play drama face"; (2) after the instruction is received, the camera of the execution body is started to recognize the gesture of the entity object, and when a hand sweeps across the face of the entity object, the face-changing technology is invoked to automatically switch to the star's image; (3) the face-changing technology performs real-time face changing based on video generation and video-stream synthesis. After the face change, the entity object can still drive the virtual idol to perform through facial-capture and motion-capture devices. Meanwhile, clothing and scenes can be switched as well; for example, according to the movie role, the scene is changed together with the face, and the virtual idol's clothing changes with it. Scenes can be preset and associated with virtual idols in advance. A sketch of this trigger flow follows.
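A minimal sketch of the voice-plus-gesture trigger flow above. The class name, method names, command strings, and gesture label are illustrative assumptions, and the actual face-changing pipeline is abstracted behind a counter.

```python
from dataclasses import dataclass

@dataclass
class FaceChangeSession:
    """Hypothetical session object for the trigger flow described above."""
    armed: bool = False
    swaps: int = 0

    def on_voice_command(self, command: str) -> None:
        # Step (1): a specific voice instruction arms the gesture detector.
        if command in ("I want to play star face", "I want to play drama face"):
            self.armed = True

    def on_gesture(self, gesture: str) -> None:
        # Step (2): a hand sweeping across the face triggers the face change.
        if self.armed and gesture == "hand_sweep_across_face":
            self.swaps += 1  # stand-in for invoking the face-changing pipeline
            self.armed = False
```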
Step 202, determining a target image transformation mode according to the attribute information of the virtual idol and/or the attribute information of the standard object.
In this embodiment, the execution body may determine the target image transformation mode according to the attribute information of the virtual idol; or according to the attribute information of the standard object; or according to both together. The target image transformation mode may be a preselected mode for transforming the idol's image, covering, for example, the face, hairstyle, scene (or background), clothing, and ornaments.
The target character conversion pattern may be generated from various visual scenes such as games, movies, and dramas. For example, the target character transformation pattern may be generated from a face, a dress, a hairstyle, a decoration, a scene (background), and the like in the "a game"; then, the target character transformation pattern may include a pattern composed of the face, clothes, hairstyle, ornaments, scene (background), and the like of the character in the "a game".
And 203, fusing the attribute information of the virtual idol and the attribute information of the standard object based on the fusion model corresponding to the target image transformation mode to obtain a fusion result.
In this embodiment, the executing entity may fuse the attribute information of the virtual idol and the attribute information of the standard object based on a fusion model corresponding to the target image transformation mode to obtain a fusion result.
Specifically, the execution body may first perform a lookup based on the target image transformation mode to obtain the fusion model corresponding to that mode, and then input the attribute information of the virtual idol and the attribute information of the standard object into the fusion model to obtain the fusion result. The fusion model can be used to fuse the attribute information of the virtual idol and the attribute information of the standard object so as to transform the image of the virtual idol. The fusion result is the output of the fusion model after it fuses the two sets of attribute information. A sketch of this lookup-and-fuse step is shown below.
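A minimal sketch of the lookup-and-fuse step; the mode names, the registry, and the toy blending function are illustrative assumptions standing in for trained fusion models.

```python
from typing import Callable, Dict
import numpy as np

FusionFn = Callable[[np.ndarray, np.ndarray], np.ndarray]

def _toy_fusion(idol_attrs: np.ndarray, standard_attrs: np.ndarray) -> np.ndarray:
    # Stand-in for a trained fusion model: blend the two attribute vectors.
    return 0.5 * idol_attrs + 0.5 * standard_attrs

# One fusion model per image transformation mode (mode names are illustrative).
FUSION_MODELS: Dict[str, FusionFn] = {
    "movie_character": _toy_fusion,
    "drama_character": _toy_fusion,
    "cartoon_character": _toy_fusion,
}

def transform_avatar(idol_attrs: np.ndarray,
                     standard_attrs: np.ndarray,
                     target_mode: str) -> np.ndarray:
    """Look up the fusion model corresponding to the target image
    transformation mode and fuse the two attribute vectors."""
    return FUSION_MODELS[target_mode](idol_attrs, standard_attrs)
```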
Taking face fusion as an example: face fusion techniques typically need to preserve the identity information of one face image while fusing in the attribute information of another. In the FaceShifter method, an autoencoder-style attribute reconstruction network is built for the target face, and features at each scale of the attribute reconstruction network are fused with the identity information of the template face.
It should be noted that the fusion in this embodiment is not limited to facial fusion (e.g., face swapping); it also includes hairstyle fusion, scene (or background) fusion, ornament fusion, and clothing fusion.
Hairstyle fusion may splice a hairstyle image with the face image of the virtual idol. Scene fusion may first process the virtual idol's scene into a transparent image and then composite the standard object's scene with the virtual idol. Ornament fusion may determine the region of the virtual idol where an ornament needs to be placed and then superimpose the ornament's image layer on that region. Clothing fusion may splice the clothing onto the region where the virtual idol's clothing is located. A compositing sketch for the layer-overlay cases is given below.
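For the ornament- and clothing-fusion cases, the layer overlay can be expressed as standard alpha compositing. The sketch below assumes an RGBA layer and an RGB base image, which is an assumption, not a detail taken from the disclosure.

```python
import numpy as np

def overlay(base_rgb: np.ndarray, layer_rgba: np.ndarray,
            top: int, left: int) -> np.ndarray:
    """Alpha-composite an RGBA ornament/clothing layer onto a region of the
    base image, as in the ornament- and clothing-fusion cases above."""
    h, w = layer_rgba.shape[:2]
    out = base_rgb.astype(np.float32).copy()
    region = out[top:top + h, left:left + w]
    alpha = layer_rgba[..., 3:4].astype(np.float32) / 255.0
    region[:] = alpha * layer_rgba[..., :3] + (1.0 - alpha) * region
    return out.astype(np.uint8)
```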
The method for transforming the image of the virtual idol provided by the embodiment of the disclosure comprises the steps of firstly obtaining attribute information of the virtual idol and attribute information of a standard object; then, determining a target image transformation mode according to the attribute information of the virtual idol and/or the attribute information of the standard object; and finally, fusing the attribute information of the virtual idol and the attribute information of the standard object based on the fusion model corresponding to the target image transformation mode to obtain a fusion result. The fusion model corresponding to the target image transformation mode can be determined by the attribute information of the virtual idol and/or the attribute information of the standard object, and the attribute information of the virtual idol and the attribute information of the standard object are fused to obtain a fusion result, so that the image transformation of the virtual idol is realized.
In some optional implementation manners of this embodiment, determining the target character transformation mode according to the attribute information of the virtual idol and/or the attribute information of the standard object may include: acquiring a preset attribute information set; matching the attribute information of the virtual idol and/or the attribute information of the standard object with a preset attribute information set to obtain a matching result; and determining a target image transformation mode according to the matching result.
In this implementation, the execution body may first obtain a preset attribute information set; then match the attribute information of the virtual idol and the attribute information of the standard object against the attribute information in the preset set to obtain a matching result; and then determine the target image transformation mode according to the matching result. The preset attribute information set may be a set of pre-collected attribute information; after it is obtained, a mapping relationship between the attribute information in the preset set and the attribute information of the virtual idol and of the standard object needs to be established in advance.
In this implementation, the target avatar transformation mode may be determined from a preset avatar transformation mode set based on the type information represented by the attribute information of the standard object. The type information represented by the attribute information of the standard object may be information that can be used to determine the type of the standard object in the attribute information of the standard object.
In one example, the type information may include: target cartoon characters, target movie and television characters, and target drama characters. If the type information indicates a target cartoon character, the image transformation mode corresponding to that target cartoon character can be determined from the preset image transformation mode set.
It should be noted that the preset character transformation mode set may be pre-established and include a plurality of character transformation modes. After the preset image transformation mode set is obtained, the mapping relation between the type information represented by the attribute information of the standard object and the image transformation mode can be established.
In one example, a mapping relationship between the type information and the avatar transformation mode may be established in advance; for example, the type information of the virtual idol may include a target cartoon character, a target drama character; then, the mapping relationship between the target cartoon character and the character transformation mode 1 and between the target drama character and the character transformation mode 2 can be established in advance; and if the type information represented by the attribute information of the virtual idol is the target cartoon character, determining the image transformation mode as an image transformation mode 1.
In this implementation, the execution body may match the attribute information of the virtual idol against the preset attribute information set to obtain an initial matching result. If there is exactly one initial matching result, the image transformation mode corresponding to it is determined as the target image transformation mode. If there are multiple initial matching results, matching can be performed against them based on the attribute information of the standard object to obtain a final matching result, and the image transformation mode corresponding to the final matching result is then determined as the target image transformation mode.
Alternatively, the execution body may determine an initial image transformation mode from the preset image transformation mode set according to the type information represented by the attribute information of the virtual idol. If multiple initial image transformation modes remain, a final image transformation mode is determined from them based on the type information represented by the attribute information of the standard object; finally, the image of the virtual idol is transformed according to the final image transformation mode. A sketch of this two-stage matching is given below.
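A minimal sketch of the two-stage matching, assuming attribute information is represented as vectors and that "matching" means cosine similarity above a preset threshold (the disclosure leaves the threshold to the user). The preset set is modeled as (attribute_info, mode) pairs; all names are illustrative.

```python
import numpy as np

SIMILARITY_THRESHOLD = 0.9  # assumed value; the disclosure leaves this to the user

def matches(a: np.ndarray, b: np.ndarray) -> bool:
    """Cosine similarity against a preset threshold, one reading of
    'the similarity satisfies a preset similarity threshold'."""
    sim = float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8)
    return sim >= SIMILARITY_THRESHOLD

def pick_transformation_mode(idol_attrs, standard_attrs, preset):
    """Match the idol's attribute information first; only when several
    candidates remain, narrow them down with the standard object's
    attribute information. `preset` is a list of (attribute_info, mode) pairs."""
    initial = [(info, mode) for info, mode in preset if matches(idol_attrs, info)]
    if len(initial) == 1:
        return initial[0][1]
    final = [(info, mode) for info, mode in initial if matches(standard_attrs, info)]
    return final[0][1] if final else None
```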
It should be noted that the attribute information of the virtual idol is matched first; then, according to the matching result, it is determined whether matching against the attribute information of the standard object still needs to be performed.
In practical application, the attribute information of the virtual idol can be preset by the entity object corresponding to the virtual idol; after finishing that setting, the entity object can also select the attribute information of the standard object.
In this implementation manner, the determination of the target image transformation mode may be implemented through a matching result between a preset attribute information set and attribute information of the virtual idol and/or attribute information of the standard object.
In some optional implementation manners of this embodiment, determining the target character transformation mode according to the matching result includes: and if the matching result comprises third attribute information which is matched with the attribute information of the virtual idol and the attribute information of the standard object simultaneously in the preset attribute information set, determining a third image transformation mode corresponding to the third attribute information as a target image transformation mode.
In this implementation, the execution body may pre-establish the correspondence between the third attribute information and both the attribute information of the virtual idol and the attribute information of the standard object. Matching may mean that the similarity satisfies a preset similarity threshold, or that the two are identical. The preset similarity threshold may be set by the user or determined by the required accuracy of the image transformation.
In this implementation, the target image transformation mode can be determined accurately through a matching result in which attribute information in the preset attribute information set matches both the attribute information of the virtual idol and the attribute information of the standard object simultaneously.
In some optional implementation manners of this embodiment, if the matching result includes first attribute information that is matched with the attribute information of the virtual idol in the preset attribute information set, or second attribute information that is matched with the attribute information of the standard object; and
the method for transforming the image of the virtual idol further comprises the following steps: acquiring user preference information aiming at an image transformation mode;
determining a target image transformation mode according to the matching result, comprising: if there is more than one first image transformation mode corresponding to the first attribute information, or more than one second image transformation mode corresponding to the second attribute information, determining the target image transformation mode from the first image transformation modes or the second image transformation modes according to the user preference information.
In this implementation, the execution body may first perform matching against the attribute information in the preset attribute information set based on the attribute information of the virtual idol or the attribute information of the standard object. The matching result may include a plurality of first image transformation modes corresponding to the first attribute information (i.e., the attribute information of the virtual idol matches attribute information in the preset set) or a plurality of second image transformation modes corresponding to the second attribute information (i.e., the attribute information of the standard object matches attribute information in the preset set). In that case, the target image transformation mode is determined from the plurality of first or second image transformation modes in combination with the user preference information of the entity object corresponding to the virtual idol, so that the mode is determined accurately on the basis of user preference. The user preference information may characterize the entity object's degree of interest in each image transformation mode, which may be determined by how often the entity object has operated (or used) that mode. For example, if the entity object recently used the target image transformation mode corresponding to a target movie character, the target image transformation mode may be selected from the candidates based on that degree of interest, as in the sketch below.
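When several candidate modes remain, a simple stand-in for "degree of interest" is a per-mode usage count; the sketch below is an assumption about how the preference-based tie-break could look, not a detail from the disclosure.

```python
from typing import Dict, List

def pick_by_preference(candidate_modes: List[str],
                       usage_counts: Dict[str, int]) -> str:
    """Choose the candidate the entity object has used most often
    (usage count as a stand-in for the user preference information)."""
    return max(candidate_modes, key=lambda m: usage_counts.get(m, 0))
```

For example, `pick_by_preference(["drama_character", "movie_character"], {"movie_character": 3})` returns `"movie_character"`.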
In this implementation, when there are multiple first image transformation modes corresponding to the first attribute information or multiple second image transformation modes corresponding to the second attribute information, the execution body can further filter them based on the user preference information, so as to determine the target image transformation mode accurately.
In some optional implementation manners of this embodiment, determining the target character transformation mode according to the matching result includes: if the matching result comprises first attribute information matched with the attribute information of the virtual idol in a preset attribute information set, determining a first image transformation mode corresponding to the first attribute information as a target image transformation mode; and if the matching result comprises second attribute information matched with the attribute information of the standard object in the preset attribute information set, determining a second image transformation mode corresponding to the second attribute information as a target image transformation mode.
In this implementation, the execution body may pre-establish the correspondence between the first attribute information and the attribute information of the virtual idol, or between the second attribute information and the attribute information of the standard object. Matching may mean that the similarity satisfies a preset similarity threshold, or that the two are identical. The preset similarity threshold may be set by the user or determined by the required accuracy of the image transformation.
In this implementation manner, the determination of the target image transformation mode may be implemented through a matching result between the attribute information in the preset attribute information set and the attribute information of the virtual idol and/or the attribute information of the standard object.
In some optional implementations of this embodiment, the obtaining the attribute information of the virtual idol and the attribute information of the standard object includes: acquiring three-dimensional deformation model parameters of the virtual idol, and extracting attribute information of the virtual idol from the three-dimensional deformation model parameters; and acquiring three-dimensional deformation model parameters of the standard object, and extracting attribute information of the standard object from the three-dimensional deformation model parameters.
In this implementation, the execution body may extract the attribute information of the virtual idol's face from the acquired three-dimensional deformation model (3D Morphable Model, 3DMM) parameters of that face, and extract the attribute information of the standard object from the acquired 3DMM parameters of the standard object. A sketch of such parameter slicing follows.
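The disclosure states only that different dimensions of the 3DMM parameter vector correspond to identity, expression, pose, illumination, hairstyle, clothing, and ornaments; the offsets below are illustrative assumptions, added to show how the slicing could look.

```python
import numpy as np

# Illustrative layout; the actual dimension split is not given in the disclosure.
SEGMENTS = {
    "identity":     slice(0, 80),
    "expression":   slice(80, 144),
    "pose":         slice(144, 150),
    "illumination": slice(150, 177),
}

def extract_attributes(params_3dmm: np.ndarray) -> dict:
    """Slice attribute information out of a flat 3DMM parameter vector."""
    return {name: params_3dmm[sl] for name, sl in SEGMENTS.items()}
```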
In this implementation manner, the acquisition of the attribute information of the virtual idol and the attribute information of the standard object can be realized based on the acquired three-dimensional deformation model parameters.
In some optional implementations of this embodiment, the obtaining the attribute information of the virtual idol and the attribute information of the standard object includes: acquiring attribute information of the virtual idol by using a three-dimensional reconstruction method; and acquiring attribute information of the standard object by using a three-dimensional reconstruction method.
In this implementation, the attribute information may include at least one of: face attribute information, hair style attribute information, scene (or background) attribute information, ornament attribute information, apparel attribute information, people number attribute information, and physiological attribute information.
In one example, take face attribute information as an instance of facial attribute information. If the attribute information includes face attribute information, obtaining the attribute information of the virtual idol may include: acquiring the 3DMM parameters of the virtual idol's face by using a 3D face reconstruction method, and extracting the attribute information of the face from those parameters.
In this implementation manner, the acquisition of the attribute information of the virtual idol and the attribute information of the standard object can be realized based on a three-dimensional reconstruction method.
In some optional implementations of this embodiment, the attribute information of the standard object or the attribute information of the virtual idol may include at least one of: face attribute information, hair style attribute information, scene attribute information, ornament attribute information, and apparel attribute information.
In this implementation, the attribute information may be embodied by face attribute information, hair style attribute information, scene (or background) attribute information, ornament attribute information, and clothing attribute information thereof. The hair style attribute information can be used for characterizing the attribute information of the hair style, such as the information of the hair style, the color, the length and the like. The scene attribute information may be used to characterize scene-related information. The ornament attribute information may be used to characterize information associated with the ornament, such as style, quantity, etc. Apparel attribute information may be used to characterize information related to the apparel, such as size, color, and the like.
In this implementation, the multi-dimensional transformation of the image of the virtual idol can be implemented based on the multi-dimensional attribute information.
In some optional implementations of this embodiment, the standard object includes at least one of: target movie characters, target drama characters, and target cartoon characters.
In this implementation, the execution body may determine the type of the standard object from the type information represented by the standard object's attribute information. A target movie and television character may be a character appearing in movie and television works. A target drama character may be a character appearing in a theatrical performance, and may also include a corresponding painted facial mask. A target cartoon character may be a character appearing in an animation.
In one example, a target movie character is taken as an example.
According to the attribute information of the virtual idol and/or the attribute information of the standard object, the target image transformation mode is determined to be the one corresponding to a target movie and television character; then, the attribute information of the virtual idol and the attribute information of the standard object are fused based on the fusion model corresponding to that mode to obtain a fusion result.
For example, suppose the first image is the face of the entity object corresponding to the virtual idol and the second image is the movie and television character "A". The third image, obtained by the fusion model corresponding to the target image transformation mode for the target movie and television character, replaces the eyebrows, eyes, nose, and mouth on the basis of the first image while keeping the facial contour of the first image.
In actual fusion, parts other than the eyebrows, eyes, nose, and mouth, such as the hairstyle, ears, and ornaments, may also be fused.
In one example, a target dramatic character is taken as an example.
When the fusion model performs fusion, the attribute information of the standard object can be overlaid on the attribute information of the virtual idol. For example, the makeup at the standard object's eyes is directly overlaid at the virtual idol's eyes, or the standard object's apparel is directly overlaid on the virtual idol's body.
If the angle of the standard object differs from that of the virtual idol, the standard object's angle may be adjusted first so that, after adjustment, it is the same as the virtual idol's angle. For example, if the face of the standard object is pitched 30 degrees upward, the angle encoded in the standard object's attribute information may be adjusted to match the virtual idol's angle before fusion, as in the sketch below.
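A sketch of one way to align the angles before fusion, reusing the illustrative 3DMM layout assumed earlier: the pose dimensions of the standard object's parameter vector are overwritten with the idol's, so both are rendered at the same angle. The offsets are assumptions, not details from the disclosure.

```python
import numpy as np

def match_angles(standard_params: np.ndarray,
                 idol_params: np.ndarray,
                 pose_slice: slice = slice(144, 150)) -> np.ndarray:
    """Overwrite the standard object's pose dimensions with the idol's.
    The pose offsets follow the illustrative 3DMM layout assumed above."""
    adjusted = standard_params.copy()
    adjusted[pose_slice] = idol_params[pose_slice]
    return adjusted
```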
In one example, a target cartoon character is taken as an example.
A cartoon character may be a character in an animation or a game. When fusing a cartoon character, the method for transforming the image of the virtual idol may further include: binding the virtual idol in advance to the muscles and bones of the entity object.
In the implementation manner, the determination of the target character transformation mode can be realized through the type of the standard object.
With further reference to FIG. 3, FIG. 3 illustrates a flow 300 of one embodiment of a method of transforming the avatar of a virtual idol in accordance with the present disclosure. The method of transforming the avatar of a virtual idol may include the steps of:
step 301, obtaining the attribute information of the virtual idol and the attribute information of the standard object.
Step 302, a preset attribute information set is obtained.
In the present embodiment, the execution body of the method of transforming the avatar of a virtual idol (e.g., the terminal apparatus 101, 102, 103 or the server 105 shown in fig. 1) may acquire a preset attribute information set. Step 301 and step 302 may be performed simultaneously or separately.
Step 303, matching the attribute information of the virtual idol and/or the attribute information of the standard object with a preset attribute information set to obtain a matching result.
In this embodiment, the execution body may match the attribute information of the virtual idol and/or the attribute information of the standard object with a preset attribute information set to obtain a matching result.
And step 304, determining a target image transformation mode according to the matching result.
In this embodiment, the execution body may determine the target character transformation pattern according to the matching result.
And 305, fusing the attribute information of the virtual idol and the attribute information of the standard object based on the fusion model corresponding to the target image transformation mode to obtain a fusion result.
In this embodiment, the specific operations of steps 301 and 305 have been described in detail in steps 201 and 203, respectively, in the embodiment shown in fig. 2, and are not described again here.
As can be seen from fig. 3, compared with the embodiment corresponding to fig. 2, this embodiment of transforming the avatar of the virtual idol highlights the step of determining the target image transformation mode. In the scheme described in this embodiment, the attribute information of the virtual idol and the attribute information of the standard object are matched against the attribute information in the preset attribute information set to obtain a matching result, and the target image transformation mode is then obtained from that result. Different matching results yield different target image transformation modes, so the image transformation of the virtual idol can be realized based on the fusion models corresponding to the different modes.
In some optional implementation manners of this embodiment, determining the target character transformation mode according to the matching result includes: if the matching result comprises first attribute information matched with the attribute information of the virtual idol in a preset attribute information set, determining a first image transformation mode corresponding to the first attribute information as a target image transformation mode; if the matching result comprises second attribute information matched with the attribute information of the standard object in the preset attribute information set, determining a second image transformation mode corresponding to the second attribute information as a target image transformation mode; and if the matching result comprises third attribute information matched with the attribute information of the virtual idol and the attribute information of the standard object in the preset attribute information set, determining a third image transformation mode corresponding to the third attribute information as a target image transformation mode.
In this implementation, matching may mean that the similarity between the two pieces of attribute information satisfies a preset similarity threshold; or that the attribute information of the virtual idol is identical to the first attribute information, or the attribute information of the standard object is identical to the second attribute information.
In this implementation, the image transformation of the virtual idol can be realized based on the fusion models corresponding to the different target image transformation modes.
With further reference to FIG. 4, FIG. 4 illustrates a flow 400 of one embodiment of a method of generating a fusion model according to the present disclosure. The method for generating the fusion model can comprise the following steps:
step 401, obtaining a training sample, where the training sample includes: sample attribute information for the virtual idol, and sample attribute information for the standard object.
In the present embodiment, an executing subject (e.g., the terminal devices 101, 102, 103 shown in fig. 1) of the method of generating the fusion model may collect training samples generated thereon; alternatively, the executing entity (e.g., server 105 shown in fig. 1) obtains training samples from terminal devices (e.g., terminal devices 101, 102, 103 shown in fig. 1).
Step 402, fusing the sample attribute information of the virtual idol and the sample attribute information of the standard object to obtain sample fusion attribute information.
In this embodiment, the execution body may fuse the virtual idol and the standard object to obtain a fused object, then acquire the attribute information of the fused object and take it as the sample fusion attribute information.
Taking a face as an example: for the face image of a virtual idol in a training sample, the face attribute information of the virtual idol can be acquired by using a 3D face reconstruction method. Preferably, the three-dimensional deformation model (3D Morphable Model, 3DMM) parameters of the virtual idol may first be obtained with the 3D face reconstruction method, and the face attribute information then extracted from the 3DMM parameters. Different dimensions of the 3DMM parameters correspond respectively to the virtual idol's identity, expression, pose, illumination, hairstyle, clothing, ornaments, and so on.
In one example, facial blending is taken as an example.
Fusing the sample attribute information of the virtual idol and the sample attribute information of the standard object to obtain sample fusion attribute information may include: fusing the face image of the virtual idol and the face image of the standard object to obtain a fused face image; then acquiring the attribute information of the fused face image and taking it as the sample fusion attribute information. Alternatively, a fusion method based on a generative adversarial network (GAN) may be used to fuse the two face images to obtain the fused face image. In practical applications, any GAN-based fusion method may be employed, such as the face-swapping FaceShifter method.
In this embodiment, the fusion technique generally needs to retain the attribute information of one character and fuse in the attribute information of another. In the FaceShifter method, an autoencoder-style attribute reconstruction network is built for the face image of the virtual idol, and features at each scale of the attribute reconstruction network are fused into the face image of the standard object.
And 403, constructing an attribute consistency loss function according to the sample attribute information of the virtual idol and the sample fusion attribute information, and performing self-supervision learning of the fusion model by using the attribute consistency loss function.
In this implementation, the sample attribute information of the virtual idol and the sample fusion attribute information can each be obtained, and the two are expected to remain consistent during fusion. Therefore, an attribute consistency loss function can be constructed from the sample attribute information of the virtual idol and the sample fusion attribute information, and used for self-supervised learning of the fusion model.
In one example, the L2 norm (L2-norm) of the virtual idol's sample attribute information and the sample fusion attribute information may be computed as the attribute consistency loss function, which takes the specific form ||A - B||_2, where A and B denote the sample attribute information of the virtual idol and the sample fusion attribute information, respectively. A code sketch of this loss follows.
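A direct PyTorch rendering of the loss just defined; a sketch assuming the attribute information is carried as flat tensors, with names following the A/B notation in the text.

```python
import torch

def attribute_consistency_loss(idol_attrs: torch.Tensor,
                               fused_attrs: torch.Tensor) -> torch.Tensor:
    """||A - B||_2, where A is the virtual idol's sample attribute
    information and B is the sample fusion attribute information."""
    return torch.norm(idol_attrs - fused_attrs, p=2)
```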
In addition, the self-supervised learning of the fusion model can be performed by combining the attribute consistency loss function with the identity consistency loss function used in GAN-based fusion methods, for example with the identity consistency loss function of the FaceShifter method.
It should be noted that the above manner ensures consistency between the sample fusion attribute information and the sample attribute information of the virtual idol.
In this embodiment, the method for generating a fusion model may further include: for different image transformation modes, realizing the image transformation of the virtual idol through the fusion model corresponding to each mode.
With the method for generating a fusion model provided by the embodiments of the present disclosure, the sample attribute information and the sample fusion attribute information of the virtual idol can be obtained separately, an attribute consistency loss function can be constructed from them, and that loss function can guide the training of the model. This improves the training effect and, in turn, the fusion quality of the trained model, yielding more realistic fused images. Given any virtual idol and target object, a corresponding fusion result can be obtained from the trained model, so the method has wide applicability and low implementation cost.
In some optional implementations of the present embodiment, the fusion model may be a generative confrontation network model.
Specifically, fusing the sample attribute information of the virtual idol and the sample attribute information of the standard object to obtain the sample fusion attribute information includes: inputting the sample attribute information of the virtual idol and the sample attribute information of the standard object into the generator of the generative adversarial network model to obtain the sample fusion attribute information.
In this implementation, the execution body may input the sample attribute information of the standard object and the sample attribute information of the virtual idol into the generator of the generative adversarial network model, fusing the two to obtain the sample fusion attribute information.
Constructing the attribute consistency loss function from the virtual idol's sample attribute information and the sample fusion attribute information, and performing self-supervised learning of the fusion model with it, includes: inputting the sample fusion attribute information and the sample attribute information of the standard object respectively into the discriminator of the generative adversarial network model to obtain a first discrimination result for the sample fusion attribute information and a second discrimination result for the sample attribute information of the standard object; determining the attribute consistency loss function according to the first discrimination result, the second discrimination result, and the sample fusion attribute information; and adjusting the network parameters of the generator according to the attribute consistency loss function.
In this implementation, the execution body may determine the attribute consistency loss function, and thus adjust the generator's network parameters, according to the first discrimination result for the sample fusion attribute information, the second discrimination result for the sample attribute information of the standard object, and the sample fusion attribute information, all obtained via the discriminator of the generative adversarial network model. One possible generator update under this setup is sketched below.
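A sketch of one generator update, assuming the generator and discriminator are ordinary PyTorch modules and that the adversarial term is a binary cross-entropy on the discriminator's score; the exact loss composition is not specified by the disclosure, and the discriminator's own update (which would use the second discrimination result) is omitted.

```python
import torch
import torch.nn.functional as F

def generator_step(generator, discriminator, opt_g,
                   idol_attrs: torch.Tensor,
                   standard_attrs: torch.Tensor) -> float:
    """One generator update: the generator fuses the two attribute vectors,
    the discriminator scores the fused result (the first discrimination
    result), and the loss combines an adversarial term with the attribute
    consistency term defined above."""
    fused = generator(idol_attrs, standard_attrs)
    d_fused = discriminator(fused)  # first discrimination result
    # Adversarial term: push the discriminator to score the fused sample as real.
    adversarial = F.binary_cross_entropy_with_logits(
        d_fused, torch.ones_like(d_fused))
    # Attribute consistency term: ||A - B||_2.
    consistency = torch.norm(idol_attrs - fused, p=2)
    loss = adversarial + consistency
    opt_g.zero_grad()
    loss.backward()
    opt_g.step()
    return loss.item()
```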
In this implementation, the execution body may perform fusion of the virtual idol's sample attribute information and the standard object's sample attribute information by using a fusion method based on a Generative Adversarial Network (GAN) model, so as to obtain the fusion result. In practical applications, any GAN-based fusion method may be used, such as the FaceShifter method.
Fusion techniques typically need to preserve the attribute information of one character while fusing in the attribute information of another. In the FaceShifter method, an autoencoder-style attribute reconstruction network is built for the virtual idol, and features at each scale of the attribute reconstruction network are fused into the sample attribute information of the standard object.
In this implementation, the GAN-based fusion method can achieve a better fusion effect, thereby facilitating subsequent processing.
With further reference to fig. 5, fig. 5 is a schematic diagram 500 of one application scenario of the method of transforming the avatar of a virtual idol according to the present disclosure. In this application scenario, taking a human face as an example, the terminal device 501 may be configured to obtain the virtual idol of an entity object; then acquire the face attribute information of the virtual idol and the face attribute information of the standard object; then determine the target image transformation mode according to the face attribute information of the virtual idol and/or the face attribute information of the standard object; then fuse the attribute information of the virtual idol and the attribute information of the standard object based on the fusion model corresponding to the target image transformation mode to obtain a fusion result; and then send the live stream to the terminal device 503 through the network 502, where the user of the terminal device 503 watches the live broadcast of the entity object (i.e., the user of the terminal device 501).
With further reference to fig. 6, as an implementation of the method shown in the above figures, the present disclosure provides an embodiment of an apparatus for transforming the image of a virtual idol, which corresponds to the embodiment of the method shown in fig. 2, and which is particularly applicable to various electronic devices.
As shown in fig. 6, the apparatus 600 for transforming the image of a virtual idol of the present embodiment may include: an information acquisition unit 601, a mode determination unit 602, and an information fusion unit 603. The information acquisition unit 601 is configured to acquire attribute information of the virtual idol and attribute information of the standard object; the mode determination unit 602 is configured to determine a target image transformation mode according to the attribute information of the virtual idol and/or the attribute information of the standard object; and the information fusion unit 603 is configured to fuse the attribute information of the virtual idol and the attribute information of the standard object based on a fusion model corresponding to the target image transformation mode to obtain a fusion result.
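Viewed structurally, the three units form a simple pipeline. The sketch below is only an illustrative composition in Python, not code from the disclosure; the unit implementations are placeholders passed in as callables.

```python
from typing import Any, Callable

class VirtualIdolImageTransformer:
    """Sketch of apparatus 600: three pluggable units chained together."""

    def __init__(self,
                 acquire: Callable[[Any, Any], tuple],   # information acquisition unit 601
                 pick_mode: Callable[[Any, Any], Any],   # mode determination unit 602
                 fuse: Callable[[Any, Any, Any], Any]):  # information fusion unit 603
        self.acquire = acquire
        self.pick_mode = pick_mode
        self.fuse = fuse

    def transform(self, virtual_idol, standard_object):
        idol_attrs, std_attrs = self.acquire(virtual_idol, standard_object)
        mode = self.pick_mode(idol_attrs, std_attrs)
        return self.fuse(idol_attrs, std_attrs, mode)
```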
In the present embodiment, for the detailed processing and technical effects of the information acquisition unit 601, the mode determination unit 602, and the information fusion unit 603 in the apparatus 600 for transforming the image of a virtual idol, reference may be made to the related descriptions of steps 201 to 203 in the embodiment corresponding to fig. 2, which are not repeated here.
In some optional implementations of this embodiment, the mode determination unit 602 includes: an information acquisition subunit configured to acquire a preset attribute information set; a result obtaining subunit configured to match the attribute information of the virtual idol and/or the attribute information of the standard object with the preset attribute information set to obtain a matching result; and a mode determination subunit configured to determine the target image transformation mode according to the matching result.
In some optional implementations of this embodiment, the mode determination subunit is further configured to: if the matching result includes third attribute information in the preset attribute information set that matches both the attribute information of the virtual idol and the attribute information of the standard object, determine a third image transformation mode corresponding to the third attribute information as the target image transformation mode.
In some optional implementations of this embodiment, if the matching result includes first attribute information in the preset attribute information set that matches the attribute information of the virtual idol, or second attribute information that matches the attribute information of the standard object, the information acquisition unit 601 is further configured to acquire user preference information for the image transformation mode, and the mode determination subunit is further configured to: if there is more than one first image transformation mode corresponding to the first attribute information or more than one second image transformation mode corresponding to the second attribute information, determine the target image transformation mode from among them according to the user preference information.
In some optional implementations of this embodiment, the mode determination subunit is further configured to: if the matching result includes first attribute information in the preset attribute information set that matches the attribute information of the virtual idol, determine a first image transformation mode corresponding to the first attribute information as the target image transformation mode; if the matching result includes second attribute information that matches the attribute information of the standard object, determine a second image transformation mode corresponding to the second attribute information as the target image transformation mode; and if the matching result includes third attribute information that matches both the attribute information of the virtual idol and the attribute information of the standard object, determine a third image transformation mode corresponding to the third attribute information as the target image transformation mode.
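The matching priority described in the preceding paragraphs can be sketched as follows. The preset-set representation, the string-valued modes, and the tie-breaking rule are all assumptions made for illustration, not details fixed by the disclosure.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

@dataclass
class PresetEntry:
    attribute: str        # e.g. "ancient_costume" (hypothetical value)
    transform_mode: str   # image transformation mode the attribute maps to

def pick_target_mode(idol_attrs: set,
                     standard_attrs: set,
                     preset: Iterable[PresetEntry],
                     preference: Optional[str] = None) -> Optional[str]:
    """Entries matching both sides (third attribute information) take priority;
    otherwise a one-sided match (first/second attribute information) is used,
    with user preference breaking ties when several modes qualify."""
    both, one_sided = [], []
    for entry in preset:
        in_idol = entry.attribute in idol_attrs
        in_standard = entry.attribute in standard_attrs
        if in_idol and in_standard:
            both.append(entry.transform_mode)
        elif in_idol or in_standard:
            one_sided.append(entry.transform_mode)
    candidates = both or one_sided
    if not candidates:
        return None
    if len(candidates) > 1 and preference in candidates:
        return preference
    return candidates[0]

if __name__ == "__main__":
    preset = [PresetEntry("ancient_costume", "costume_drama_mode"),
              PresetEntry("long_hair", "fairy_mode")]
    print(pick_target_mode({"long_hair"}, {"ancient_costume"}, preset,
                           preference="fairy_mode"))  # -> "fairy_mode"
```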
In some optional implementations of this embodiment, the information acquisition unit 601 is further configured to: acquire three-dimensional morphable model parameters of the virtual idol and extract the attribute information of the virtual idol from them; and acquire three-dimensional morphable model parameters of the standard object and extract the attribute information of the standard object from them.
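Extracting attribute information from the three-dimensional morphable model (3DMM) parameters could, for instance, amount to slicing the coefficient vector into named blocks. The layout and block sizes below are hypothetical, as the disclosure does not specify a parameterization.

```python
import numpy as np

# Hypothetical 3DMM coefficient layout: [identity | expression | texture].
ID_DIM, EXP_DIM, TEX_DIM = 80, 64, 80

def extract_attribute_info(coeffs: np.ndarray) -> dict:
    """Splits a flat 3DMM parameter vector into named attribute blocks."""
    assert coeffs.shape[-1] == ID_DIM + EXP_DIM + TEX_DIM
    return {
        "identity": coeffs[..., :ID_DIM],
        "expression": coeffs[..., ID_DIM:ID_DIM + EXP_DIM],
        "texture": coeffs[..., ID_DIM + EXP_DIM:],
    }

if __name__ == "__main__":
    idol_coeffs = np.random.randn(ID_DIM + EXP_DIM + TEX_DIM)
    print({k: v.shape for k, v in extract_attribute_info(idol_coeffs).items()})
```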
In some optional implementations of this embodiment, the information acquisition unit 601 is further configured to: acquire the attribute information of the virtual idol using a three-dimensional reconstruction device; and acquire the attribute information of the standard object using the three-dimensional reconstruction device.
In some optional implementations of this embodiment, the standard object includes at least one of: target movie characters, target drama characters, and target cartoon characters.
In some optional implementations of this embodiment, the apparatus for transforming the image of a virtual idol further includes: a sample obtaining unit configured to obtain a training sample, the training sample including sample attribute information of the virtual idol and sample attribute information of the standard object; an information obtaining unit configured to fuse the sample attribute information of the virtual idol and the sample attribute information of the standard object to obtain sample fusion attribute information; and a model training unit configured to construct an attribute consistency loss function according to the sample attribute information of the virtual idol and the sample fusion attribute information, and to perform self-supervised learning of the fusion model using the attribute consistency loss function.
In some optional implementations of this embodiment, the fusion model is a generative adversarial network model;
the information obtaining unit is further configured to: input the sample attribute information of the virtual idol and the sample attribute information of the standard object into a generator of the generative adversarial network model to obtain the sample fusion attribute information;
the model training unit is further configured to: input the sample fusion attribute information and the sample attribute information of the standard object respectively into a discriminator of the generative adversarial network model to obtain a first discrimination result for the sample fusion attribute information and a second discrimination result for the sample attribute information of the standard object; determine the attribute consistency loss function according to the first discrimination result, the second discrimination result, and the sample fusion attribute information; and adjust the network parameters of the generator according to the attribute consistency loss function.
In some optional implementations of this embodiment, the attribute information of the standard object or the attribute information of the virtual idol includes at least one of: face attribute information, hair style attribute information, scene attribute information, ornament attribute information, and apparel attribute information.
According to embodiments of the present disclosure, the present disclosure also provides an electronic device, a readable storage medium, and a computer program product.
FIG. 7 illustrates a schematic block diagram of an example electronic device 700 that can be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 7, the device 700 comprises a computing unit 701, which may perform various suitable actions and processes according to a computer program stored in a Read Only Memory (ROM) 702 or a computer program loaded from a storage unit 708 into a Random Access Memory (RAM) 703. In the RAM 703, various programs and data required for the operation of the device 700 can also be stored. The computing unit 701, the ROM 702, and the RAM 703 are connected to each other by a bus 704. An input/output (I/O) interface 705 is also connected to bus 704.
Various components in the device 700 are connected to the I/O interface 705, including: an input unit 706 such as a keyboard, a mouse, or the like; an output unit 707 such as various types of displays, speakers, and the like; a storage unit 708 such as a magnetic disk, optical disk, or the like; and a communication unit 709 such as a network card, modem, wireless communication transceiver, etc. The communication unit 709 allows the device 700 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
Computing unit 701 may be a variety of general purpose and/or special purpose processing components with processing and computing capabilities. Some examples of the computing unit 701 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and so forth. The computing unit 701 performs the respective methods and processes described above, such as the method of transforming the image of a virtual idol. For example, in some embodiments, the method of transforming the image of the virtual idol may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as storage unit 708. In some embodiments, part or all of a computer program may be loaded onto and/or installed onto device 700 via ROM 702 and/or communications unit 709. When the computer program is loaded into RAM 703 and executed by the computing unit 701, one or more steps of the method of transforming the image of a virtual idol described above may be performed. Alternatively, in other embodiments, the computing unit 701 may be configured by any other suitable means (e.g., by means of firmware) to perform the method of transforming the image of the virtual idol.
Various implementations of the systems and techniques described here above may be implemented in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on chips (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implemented in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
Artificial intelligence is the discipline of making computers simulate certain human thinking processes and intelligent behaviors (such as learning, reasoning, thinking, and planning), and it spans both hardware-level and software-level technologies. Artificial intelligence hardware technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, and big data processing; artificial intelligence software technologies mainly include computer vision technology, speech recognition technology, natural language processing technology, machine learning/deep learning, big data processing technology, knowledge graph technology, and the like.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in this disclosure may be performed in parallel, sequentially, or in a different order, as long as the desired results of the technical solutions mentioned in this disclosure can be achieved, and are not limited herein.
The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.

Claims (19)

1. A method of transforming the image of a virtual idol, comprising:
acquiring attribute information of the virtual idol and attribute information of a standard object;
determining a target image transformation mode according to the attribute information of the virtual idol and/or the attribute information of the standard object;
and fusing the attribute information of the virtual idol and the attribute information of the standard object based on a fusion model corresponding to the target image transformation mode to obtain a fusion result.
2. The method of claim 1, wherein the determining a target image transformation mode according to the attribute information of the virtual idol and/or the attribute information of the standard object comprises:
acquiring a preset attribute information set;
matching the attribute information of the virtual idol and/or the attribute information of the standard object with the preset attribute information set to obtain a matching result;
and determining the target image transformation mode according to the matching result.
3. The method of claim 2, wherein the determining the target image transformation mode according to the matching result comprises:
and if the matching result comprises third attribute information which is matched with the attribute information of the virtual idol and the attribute information of the standard object in the preset attribute information set at the same time, determining a third image transformation mode corresponding to the third attribute information as the target image transformation mode.
4. The method according to claim 2, wherein if the matching result includes first attribute information in the preset attribute information set matching the attribute information of the virtual idol, or second attribute information matching the attribute information of the standard object; and
the method further comprises the following steps:
acquiring user preference information aiming at an image transformation mode;
the determining the target image transformation mode according to the matching result comprises:
if there is more than one first image transformation mode corresponding to the first attribute information or more than one second image transformation mode corresponding to the second attribute information, determining the target image transformation mode from among the first image transformation modes or the second image transformation modes according to the user preference information.
5. The method of any of claims 1-4, wherein said obtaining attribute information of the virtual idol and attribute information of the standard object comprises:
acquiring three-dimensional morphable model parameters of the virtual idol, and extracting attribute information of the virtual idol from the three-dimensional morphable model parameters; and
acquiring three-dimensional morphable model parameters of the standard object, and extracting attribute information of the standard object from the three-dimensional morphable model parameters.
6. The method of claim 1, wherein the fusion model is determined based on:
obtaining a training sample, wherein the training sample comprises: sample attribute information of the virtual idol and sample attribute information of the standard object;
fusing the sample attribute information of the virtual idol and the sample attribute information of the standard object to obtain sample fusion attribute information;
and constructing an attribute consistency loss function according to the sample attribute information of the virtual idol and the sample fusion attribute information, and performing self-supervised learning of the fusion model using the attribute consistency loss function.
7. The method of claim 6, wherein the fusion model is a generative adversarial network model; and
the fusing the sample attribute information of the virtual idol and the sample attribute information of the standard object to obtain sample fusion attribute information comprises:
inputting the sample attribute information of the virtual idol and the sample attribute information of the standard object into a generator of the generative adversarial network model to obtain the sample fusion attribute information;
the constructing an attribute consistency loss function according to the sample attribute information of the virtual idol and the sample fusion attribute information, and performing the self-supervised learning of the fusion model using the attribute consistency loss function comprises:
inputting the sample fusion attribute information and the sample attribute information of the standard object respectively into a discriminator of the generative adversarial network model to obtain a first discrimination result for the sample fusion attribute information and a second discrimination result for the sample attribute information of the standard object;
determining the attribute consistency loss function according to the first discrimination result, the second discrimination result and the sample fusion attribute information;
and adjusting the network parameters of the generator according to the attribute consistency loss function.
8. The method according to any of claims 1-7, wherein the attribute information comprises at least one of: face attribute information, hair style attribute information, scene attribute information, ornament attribute information, and apparel attribute information.
9. An apparatus for transforming the image of a virtual idol, comprising:
an information acquisition unit configured to acquire attribute information of the virtual idol and attribute information of the standard object;
a mode determination unit configured to determine a target image transformation mode according to attribute information of the virtual idol and/or attribute information of the standard object;
and the information fusion unit is configured to fuse the attribute information of the virtual idol and the attribute information of the standard object based on a fusion model corresponding to the target image transformation mode to obtain a fusion result.
10. The apparatus of claim 9, wherein the mode determination unit comprises:
an information acquisition subunit configured to acquire a preset attribute information set;
a result obtaining subunit configured to match the attribute information of the virtual idol and/or the attribute information of the standard object with the preset attribute information set to obtain a matching result;
a mode determination subunit configured to determine the target image transformation mode according to the matching result.
11. The apparatus of claim 10, wherein the mode determination subunit is further configured to:
and if the matching result comprises third attribute information which is matched with the attribute information of the virtual idol and the attribute information of the standard object in the preset attribute information set at the same time, determining a third image transformation mode corresponding to the third attribute information as the target image transformation mode.
12. The apparatus according to claim 10, wherein if the matching result includes first attribute information in the preset attribute information set matching the attribute information of the virtual idol, or second attribute information matching the attribute information of the standard object; and
the information acquisition unit is further configured to: acquiring user preference information aiming at an image transformation mode;
the mode determination subunit is further configured to: if there is more than one first image transformation mode corresponding to the first attribute information or more than one second image transformation mode corresponding to the second attribute information, determine the target image transformation mode from among the first image transformation modes or the second image transformation modes according to the user preference information.
13. The apparatus according to any one of claims 9-12, wherein the information obtaining unit is further configured to:
acquiring three-dimensional morphable model parameters of the virtual idol, and extracting attribute information of the virtual idol from the three-dimensional morphable model parameters; and
acquiring three-dimensional morphable model parameters of the standard object, and extracting attribute information of the standard object from the three-dimensional morphable model parameters.
14. The apparatus of claim 9, the apparatus further comprising:
a sample obtaining unit configured to obtain a training sample including: sample attribute information of the virtual idol and sample attribute information of the standard object;
the information obtaining unit is configured to fuse the sample attribute information of the virtual idol and the sample attribute information of the standard object to obtain sample fusion attribute information;
and the model training unit is configured to construct an attribute consistency loss function according to the sample attribute information of the virtual idol and the sample fusion attribute information, and to perform the self-supervised learning of the fusion model using the attribute consistency loss function.
15. The apparatus of claim 14, wherein the fusion model is a generative adversarial network model; and
the information obtaining unit is further configured to: input the sample attribute information of the virtual idol and the sample attribute information of the standard object into a generator of the generative adversarial network model to obtain the sample fusion attribute information;
the model training unit is further configured to: input the sample fusion attribute information and the sample attribute information of the standard object respectively into a discriminator of the generative adversarial network model to obtain a first discrimination result for the sample fusion attribute information and a second discrimination result for the sample attribute information of the standard object; determine the attribute consistency loss function according to the first discrimination result, the second discrimination result and the sample fusion attribute information; and adjust the network parameters of the generator according to the attribute consistency loss function.
16. The apparatus according to any of claims 9-15, wherein the attribute information comprises at least one of: face attribute information, hair style attribute information, scene attribute information, ornament attribute information, and apparel attribute information.
17. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-8.
18. A non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any one of claims 1-8.
19. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1-8.
CN202110585489.3A 2021-05-27 2021-05-27 Method, apparatus, medium and program product for transforming an image of a virtual idol Active CN113362263B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110585489.3A CN113362263B (en) 2021-05-27 2021-05-27 Method, apparatus, medium and program product for transforming an image of a virtual idol

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110585489.3A CN113362263B (en) 2021-05-27 2021-05-27 Method, apparatus, medium and program product for transforming an image of a virtual idol

Publications (2)

Publication Number Publication Date
CN113362263A true CN113362263A (en) 2021-09-07
CN113362263B CN113362263B (en) 2023-09-15

Family

ID=77527906

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110585489.3A Active CN113362263B (en) 2021-05-27 2021-05-27 Method, apparatus, medium and program product for transforming an image of a virtual idol

Country Status (1)

Country Link
CN (1) CN113362263B (en)

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020150308A1 (en) * 2001-03-29 2002-10-17 Kenji Nakamura Image processing method, and an apparatus provided with an image processing function
CN101382931A (en) * 2008-10-17 2009-03-11 劳英杰 Interchange internal code for electronic, information and communication system and use thereof
CN105556508A (en) * 2013-08-04 2016-05-04 艾斯适配有限公司 Devices, systems and methods of virtualizing a mirror
CN108388399A (en) * 2018-01-12 2018-08-10 北京光年无限科技有限公司 The method of state management and system of virtual idol
CN108833810A (en) * 2018-06-21 2018-11-16 珠海金山网络游戏科技有限公司 The method and device of subtitle is generated in a kind of live streaming of three-dimensional idol in real time
CN108961367A (en) * 2018-06-21 2018-12-07 珠海金山网络游戏科技有限公司 The method, system and device of role image deformation in the live streaming of three-dimensional idol
CN111860167A (en) * 2020-06-18 2020-10-30 北京百度网讯科技有限公司 Face fusion model acquisition and face fusion method, device and storage medium
CN112489174A (en) * 2020-12-25 2021-03-12 游艺星际(北京)科技有限公司 Action display method, device electronic equipment and storage medium of virtual image model

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZHOU Chao et al.: "Research on 3D Virtual Try-On in Online Clothing Shopping", Fashion Guide (《服饰导刊》), no. 1, pages 41-45 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113769393A (en) * 2021-09-27 2021-12-10 上海完美时空软件有限公司 Method and device for generating character image, storage medium and electronic device
WO2023051238A1 (en) * 2021-09-29 2023-04-06 北京字跳网络技术有限公司 Method and apparatus for generating animal figure, and device and storage medium
CN114187392A (en) * 2021-10-29 2022-03-15 北京百度网讯科技有限公司 Virtual even image generation method and device and electronic equipment
CN114187392B (en) * 2021-10-29 2024-04-19 北京百度网讯科技有限公司 Virtual even image generation method and device and electronic equipment
TWI821876B (en) * 2022-01-21 2023-11-11 在地實驗文化事業有限公司 Mobile smart augmented reality live broadcast device
CN115222895A (en) * 2022-08-30 2022-10-21 北京百度网讯科技有限公司 Image generation method, device, equipment and storage medium
CN116030150A (en) * 2023-01-03 2023-04-28 北京百度网讯科技有限公司 Avatar generation method, device, electronic equipment and medium
CN116030150B (en) * 2023-01-03 2023-11-28 北京百度网讯科技有限公司 Avatar generation method, device, electronic equipment and medium

Also Published As

Publication number Publication date
CN113362263B (en) 2023-09-15

Similar Documents

Publication Publication Date Title
CN111695471B (en) Avatar generation method, apparatus, device and storage medium
CN112541963B (en) Three-dimensional avatar generation method, three-dimensional avatar generation device, electronic equipment and storage medium
CN111652828B (en) Face image generation method, device, equipment and medium
CN113362263B (en) Method, apparatus, medium and program product for transforming an image of a virtual idol
US20220004765A1 (en) Image processing method and apparatus, and storage medium
CN111833418B (en) Animation interaction method, device, equipment and storage medium
US20210174072A1 (en) Microexpression-based image recognition method and apparatus, and related device
CN110688948B (en) Method and device for transforming gender of human face in video, electronic equipment and storage medium
CN113287118A (en) System and method for face reproduction
KR102491140B1 (en) Method and apparatus for generating virtual avatar
WO2022252866A1 (en) Interaction processing method and apparatus, terminal and medium
CN114723888B (en) Three-dimensional hair model generation method, device, equipment, storage medium and product
CN113129450A (en) Virtual fitting method, device, electronic equipment and medium
CN114549710A (en) Virtual image generation method and device, electronic equipment and storage medium
CN110148191A (en) The virtual expression generation method of video, device and computer readable storage medium
CN114187392B (en) Virtual even image generation method and device and electronic equipment
CN112634413B (en) Method, apparatus, device and storage medium for generating model and generating 3D animation
CN112562045B (en) Method, apparatus, device and storage medium for generating model and generating 3D animation
CN114222076A (en) Face changing video generation method, device, equipment and storage medium
CN112116548A (en) Method and device for synthesizing face image
CN111597926A (en) Image processing method and device, electronic device and storage medium
CN113327311B (en) Virtual character-based display method, device, equipment and storage medium
CN113223128B (en) Method and apparatus for generating image
CN114648601A (en) Virtual image generation method, electronic device, program product and user terminal
CN114332365A (en) Virtual character generation method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant