CN116912345B

CN116912345B - Portrait cartoon processing method, device, equipment and storage medium

Info

Publication number: CN116912345B
Application number: CN202310854234.1A
Authority: CN
Inventors: 肖冠正; 张鑫; 苏泽阳; 赵岩
Original assignee: iMusic Culture and Technology Co Ltd
Current assignee: iMusic Culture and Technology Co Ltd
Priority date: 2023-07-12
Filing date: 2023-07-12
Publication date: 2024-04-26
Anticipated expiration: 2043-07-12
Also published as: CN116912345A

Abstract

The invention discloses a portrait animation processing method, device, equipment and storage medium. The method comprises the following steps: acquiring a portrait picture; preprocessing the portrait picture to obtain a preprocessed image; performing super-resolution denoising processing on the preprocessed image through a pre-trained portrait super-resolution model to obtain a super-resolution image; and performing cartoon processing on the super-resolution image through a pre-trained cartoon model to obtain a cartoon figure. By combining super-resolution technology with cartoon technology, the embodiment of the invention improves the cartoon processing effect for portraits and can be widely applied in the technical field of image processing.

Description

Portrait cartoon processing method, device, equipment and storage medium

Technical Field

The invention relates to the technical field of image processing, in particular to a method, a device, equipment and a storage medium for processing a portrait animation.

Background

In the field of image processing, portrait animation is a technique that converts a real portrait into a cartoon-style image. In recent years, methods that realize animation through deep learning have begun to appear. In the related art, style transfer is generally used to transfer cartoon textures onto real-world scenes, but such methods cannot preserve clear facial contours when processing a human face, and the generated cartoon pictures are relatively blurred, so the generation effect is poor. In view of the foregoing, the technical problems in the related art need to be solved.

Disclosure of Invention

In view of this, the embodiments of the present invention provide a method, apparatus, device, and storage medium for processing a portrait animation, so as to improve the definition of a generated image.

In one aspect, the invention provides a method for processing a portrait animation, which comprises the following steps:

Acquiring a portrait picture;

preprocessing the portrait picture to obtain a preprocessed image;

performing super-resolution denoising processing on the preprocessed image through a pre-trained portrait super-resolution model to obtain a super-resolution image;

and performing cartoon processing on the super-resolution image through a pre-trained cartoon model to obtain a cartoon figure.

Optionally, the preprocessing the portrait picture to obtain a preprocessed image includes:

decoding the portrait picture to obtain image data;

Performing edge filling processing on the image data to obtain a filled image;

and carrying out normalization processing on the filling image to obtain a preprocessed image.

Optionally, the performing super-resolution denoising processing on the preprocessed image through a pre-trained portrait super-resolution model to obtain a super-resolution image includes:

inputting the preprocessed image into the pre-trained portrait super-resolution model;

performing feature extraction processing on the preprocessed image through multi-layer depth separable convolution to obtain multi-layer features;

And carrying out up-sampling treatment on the multi-layer features through the sub-pixel convolution layer to obtain a super-resolution image.

Optionally, before the super-resolution denoising processing is performed on the preprocessed image through the pre-trained super-resolution model of the portrait, the method further includes pre-training the super-resolution model of the portrait, including:

Acquiring a portrait training data set;

Inputting the portrait training data set into the portrait super-resolution model to obtain a generated data set;

inputting the generated data set into a super-resolution discriminator network to obtain a generated data discrimination result;

and updating parameters of the portrait super-resolution model according to the generated data discrimination result.

Optionally, the performing cartoon processing on the super-resolution image through a pre-trained cartoon model to obtain a cartoon figure includes:

inputting the super-resolution image into the pre-trained cartoon model;

performing feature extraction processing on the super-resolution image through an encoder to obtain a feature vector;

and converting the feature vector through a decoder to obtain the cartoon figure.

Optionally, before the cartoon processing is performed on the super-resolution image through the pre-trained cartoon model to obtain the cartoon figure, the method further comprises pre-training the cartoon model, and specifically comprises the following steps:

Acquiring face image data, and performing data augmentation processing on the face image data to obtain cartoon training data;

performing cartoon treatment on the cartoon training data through the cartoon model to obtain cartoon generation data;

the cartoon generation data is identified by a cartoon identifier, so that a cartoon identification result is obtained;

And updating parameters of the cartoon model according to the cartoon identification result.

Optionally, the performing data augmentation processing on the face image data to obtain cartoon training data includes:

scaling and rotating the face image data to obtain image augmentation data;

And performing edge filling processing on the image augmentation data to obtain cartoon training data.

In another aspect, the embodiment of the invention also provides a portrait animation processing device, which comprises:

the first module is used for acquiring a portrait picture;

the second module is used for preprocessing the portrait picture to obtain a preprocessed image;

the third module is used for performing super-resolution denoising processing on the preprocessed image through a pre-trained portrait super-resolution model to obtain a super-resolution image;

and the fourth module is used for performing cartoon processing on the super-resolution image through a pre-trained cartoon model to obtain a cartoon figure.

Optionally, the second module is configured to perform preprocessing on the portrait image to obtain a preprocessed image, and includes:

The first unit is used for decoding the portrait picture to obtain image data;

A second unit, configured to perform edge filling processing on the image data to obtain a filled image;

And the third unit is used for carrying out normalization processing on the filling image to obtain a preprocessed image.

Optionally, the third module is configured to perform super-resolution denoising processing on the preprocessed image through a pre-trained portrait super-resolution model, to obtain a super-resolution image, and includes:

a fourth unit for inputting the pre-processed image into the pre-trained portrait super-resolution model;

A fifth unit, configured to perform feature extraction processing on the preprocessed image through multi-layer depth separable convolution, to obtain multi-layer features;

and a sixth unit, configured to perform upsampling processing on the multi-layer feature through a sub-pixel convolution layer, so as to obtain a super-resolution image.

Optionally, the device further includes a fifth module, configured to pre-train the portrait super-resolution model, specifically including:

a seventh unit, configured to acquire a portrait training dataset;

An eighth unit, configured to input the portrait training data set to the portrait super-resolution model, to obtain a generated data set;

A ninth unit, configured to input the generated data set to a super-resolution discriminator network, to obtain a generated data discrimination result;

and a tenth unit, configured to update parameters of the portrait super-resolution model according to the generated data discrimination result.

Optionally, the fourth module is configured to perform cartoon processing on the super-resolution image through a pre-trained cartoon model to obtain a cartoon figure, and includes:

an eleventh unit for inputting the super-resolution image into the pre-trained cartoon model;

a twelfth unit, configured to perform feature extraction processing on the super-resolution image through an encoder to obtain a feature vector;

And a thirteenth unit, configured to perform conversion processing on the feature vector by using a decoder, so as to obtain a cartoon figure.

Optionally, the apparatus further includes a sixth module for pre-training the cartoon model, including:

A fourteenth unit, configured to acquire face image data, and perform data augmentation processing on the face image data to obtain cartoon training data;

a fifteenth unit, configured to perform cartoon processing on the cartoon training data through the cartoon model to obtain cartoon generation data;

a sixteenth unit, configured to identify the cartoon generation data through a cartoon identifier to obtain a cartoon identification result;

and a seventeenth unit, configured to update the parameters of the cartoon model according to the cartoon identification result.

Optionally, the fourteenth unit is configured to acquire face image data, perform data augmentation processing on the face image data, and obtain cartoon training data, and includes:

the first subunit is used for carrying out scaling and rotation processing on the face image data to obtain image augmentation data;

And the second subunit is used for performing edge filling processing on the image augmentation data to obtain cartoon training data.

In another aspect, the embodiment of the invention also discloses electronic equipment, which comprises a processor and a memory;

the memory is used for storing programs;

the processor executes the program to implement the method as described above.

In another aspect, embodiments of the present invention also disclose a computer readable storage medium storing a program for execution by a processor to implement a method as described above.

In another aspect, embodiments of the present invention also disclose a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The computer instructions may be read from a computer-readable storage medium by a processor of a computer device, and executed by the processor, to cause the computer device to perform the foregoing method.

Compared with the prior art, the technical scheme provided by the invention has the following technical effects. The embodiment of the invention performs super-resolution denoising processing on the preprocessed image through a pre-trained portrait super-resolution model to obtain a super-resolution image; sharpening the input image with super-resolution technology reduces input noise and improves the cartoon effect. The embodiment of the invention then performs cartoon processing on the super-resolution image through the pre-trained cartoon model to obtain the cartoon figure, so that end-to-end cartoon processing can be carried out without depending on face key point detection and segmentation models, which improves the cartoon processing effect.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a flowchart of a portrait animation processing method provided by an embodiment of the invention;

FIG. 2 is a flowchart of step S102 in FIG. 1;

FIG. 3 is a flowchart of step S103 in FIG. 1;

FIG. 4 is a schematic structural diagram of a portrait super-resolution model according to an embodiment of the invention;

FIG. 5 is a flowchart of step S104 in FIG. 1;

FIG. 6 is a schematic structural diagram of a cartoon model according to an embodiment of the invention;

FIG. 7 is a flowchart of a training method for a cartoon model provided by an embodiment of the invention;

FIG. 8 is a flowchart of step S501 in FIG. 7;

FIG. 9 is a schematic structural diagram of a portrait animation processing device according to an embodiment of the invention;

FIG. 10 is a schematic structural diagram of an electronic device according to an embodiment of the invention.

Detailed Description

The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.

In the related art, in the field of image processing, portrait animation is a technology for converting a real portrait image into a cartoon-style image. Traditional portrait animation methods are typically implemented with image processing algorithms and filters, but they cannot generate high-quality images because they do not accurately represent the details and textures of the real image. In recent years, methods that realize animation through deep learning have begun to appear. For example, some methods use style transfer to transfer cartoon textures onto real-world scenes, but they cannot preserve clear facial contours when processing a face. Other methods perform cartoon processing on specific areas based on face key point information and segmentation information, but when the picture input by the user is relatively blurred, the cartoon processing effect is poor; for example, the face area is processed abnormally because the face cannot be detected.

Therefore, the embodiment of the invention provides a portrait animation processing method that combines super-resolution technology with cartoon technology to improve the portrait animation processing effect. The method can be applied to a terminal, a server, or software running in a terminal or server. The terminal may be, but is not limited to, a tablet computer, a notebook computer, a desktop computer, or the like. The server may be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN, and big data and artificial intelligence platforms.

Referring to fig. 1, an embodiment of the present invention provides a method for processing a portrait animation, including:

S101, acquiring a portrait picture;

S102, preprocessing the portrait picture to obtain a preprocessed image;

S103, performing super-resolution denoising processing on the preprocessed image through a pre-trained portrait super-resolution model to obtain a super-resolution image;

S104, performing cartoon processing on the super-resolution image through a pre-trained cartoon model to obtain a cartoon figure.

In the embodiment of the invention, a portrait picture is first acquired; the portrait picture may be a whole-body photo, a half-body photo, a face image, or the like. The portrait picture input by the target object is then preprocessed, that is, subjected to data preprocessing such as decoding, edge filling and normalization, to obtain a preprocessed image. The preprocessed image is input into a pre-trained portrait super-resolution model for super-resolution denoising processing, so as to obtain a super-resolution image. By sharpening the input preprocessed image with the portrait super-resolution model, the embodiment of the invention removes noise and blurred areas from the input image, reduces the interference of noise and blur with the subsequent cartoon model, and improves the cartoon effect. Finally, the super-resolution image is subjected to cartoon processing through a pre-trained cartoon model to obtain a cartoon figure. Through data augmentation strategies such as scaling, rotation and edge filling during training, a cartoon model trained on paired face data can be applied directly to whole-body image scenes, which reduces the difficulty of constructing a whole-body cartoon model. The training yields an end-to-end cartoon processing model that does not depend on face key point detection or segmentation models, so that any image uploaded by the target object can be converted into a cartoon image.

It should be noted that, in each specific embodiment of the present application, when related processing is required to be performed according to data related to the identity or characteristics of the target object, such as information of the target object, behavior data of the target object, history data of the target object, and position information of the target object, permission or consent of the target object is obtained first, and the collection, use, processing, etc. of the data complies with related laws and regulations and standards. In addition, when the embodiment of the application needs to acquire the sensitive information of the target object, the independent permission or independent consent of the target object is acquired through a popup window or a jump to a confirmation page or the like, and after the independent permission or independent consent of the target object is explicitly acquired, the necessary target object related data for enabling the embodiment of the application to normally operate is acquired.

Further as an optional embodiment, referring to fig. 2, in step S102, the preprocessing the portrait picture to obtain a preprocessed image includes:

S201, decoding the portrait picture to obtain image data;

S202, performing edge filling processing on the image data to obtain a filled image;

S203, carrying out normalization processing on the filled image to obtain a preprocessed image.

In the embodiment of the invention, the portrait picture is decoded to obtain image data. The image data can be obtained by an encoder performing mapping, quantization and symbol encoding on the input image, followed by a decoder performing symbol decoding and inverse mapping. In addition, the embodiment of the invention stores the image data as a NumPy array. NumPy is the foundation of numerical and scientific computing in the Python ecosystem and underlies many toolkits; it provides vector and matrix operations that help optimize the performance of quantitative analysis algorithms. Edge filling processing is then performed on the image data to obtain a filled image. In one possible embodiment, it is determined whether the width and height of the input image are multiples of 32; if not, the input image is edge-filled so that its width and height are expanded to multiples of 32. Finally, normalization processing is performed on the filled image to obtain a preprocessed image, that is, the pixel values of the filled image are normalized from the range 0 to 255 to the interval 0 to 1.
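The edge filling and normalization steps can be sketched in NumPy as follows. This is a minimal illustration that starts from an already decoded H x W x C uint8 array; the function names, the choice to pad only the bottom and right edges, and the edge-replication padding mode are assumptions, since the description only specifies padding to a multiple of 32 and scaling pixel values to the interval 0 to 1.

```python
import numpy as np


def pad_to_multiple(image: np.ndarray, multiple: int = 32):
    """Pad the bottom and right edges so height and width become multiples of `multiple`."""
    height, width = image.shape[:2]
    pad_h = (-height) % multiple  # 0 when the height is already a multiple
    pad_w = (-width) % multiple   # 0 when the width is already a multiple
    # Assumption: replicate the border pixels; the description only says "edge filling".
    padded = np.pad(image, ((0, pad_h), (0, pad_w), (0, 0)), mode="edge")
    return padded, (pad_h, pad_w)


def preprocess(image_data: np.ndarray, multiple: int = 32):
    """Pad a decoded H x W x C uint8 image and normalize its pixels from 0..255 to 0..1."""
    padded, padding = pad_to_multiple(image_data, multiple)
    normalized = padded.astype(np.float32) / 255.0
    return normalized, padding
```

Returning the padding amounts alongside the image keeps the information needed to remove the filled area after cartoon processing.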

Further as an optional embodiment, referring to fig. 3, in step S103, the performing, by using a pre-trained portrait super-resolution model, super-resolution denoising processing on the preprocessed image to obtain a super-resolution image includes:

S301, inputting the preprocessed image into the pre-trained portrait super-resolution model;

S302, performing feature extraction processing on the preprocessed image through multi-layer depth separable convolution to obtain multi-layer features;

S303, carrying out up-sampling processing on the multi-layer features through the sub-pixel convolution layer to obtain a super-resolution image.

In the embodiment of the invention, super-resolution denoising processing is performed on the preprocessed image through a pre-trained portrait super-resolution model. The preprocessed image is input into the pre-trained portrait super-resolution model, which comprises multiple depth separable (depthwise separable) convolution layers and a sub-pixel convolution layer. Each depth separable convolution layer consists of a channel-by-channel (depthwise) convolution and a point-by-point (pointwise) convolution and is used for feature extraction. The sub-pixel convolution layer rearranges the pixels of its input features; it is a convolution layer with an up-sampling function that is used in super-resolution reconstruction. Feature extraction is performed on the preprocessed image through the multi-layer depth separable convolution to obtain multi-layer features, and the multi-layer features are up-sampled through the sub-pixel convolution layer to obtain a super-resolution image. Referring to fig. 4, the portrait super-resolution model is composed of a first ordinary convolution layer, multiple depth separable convolution layers, a second ordinary convolution layer, and a sub-pixel convolution layer. By sharpening the input image with the portrait super-resolution model, the embodiment of the invention removes noise and blurred areas from the input image, reduces the interference of noise and blur with the cartoon model, and improves the subsequent cartoon effect.
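The pixel rearrangement performed by a sub-pixel convolution layer can be illustrated with a short NumPy sketch. It assumes a channel-first feature map of shape (C * r * r, H, W) and the commonly used ordering in which each group of r * r channels fills one r x r block of output pixels; the function name `pixel_shuffle` and the upscale factor `r` are illustrative, and the description does not state which factor the model uses.

```python
import numpy as np


def pixel_shuffle(features: np.ndarray, r: int) -> np.ndarray:
    """Rearrange a (C * r * r, H, W) feature map into a (C, H * r, W * r) image."""
    channels, height, width = features.shape
    if channels % (r * r) != 0:
        raise ValueError("channel count must be divisible by r * r")
    out_channels = channels // (r * r)
    # Split the channel axis into (out_channels, r, r), then interleave the two
    # r-sized axes with the spatial axes: (C, H, r, W, r) -> (C, H * r, W * r).
    blocks = features.reshape(out_channels, r, r, height, width)
    return blocks.transpose(0, 3, 1, 4, 2).reshape(out_channels, height * r, width * r)
```

In a full model the convolution preceding this rearrangement would produce the C * r * r channels, so the up-sampling itself has no parameters of its own.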

Further as an optional implementation manner, before the super-resolution denoising processing is performed on the preprocessed image through the pre-trained super-resolution model of the portrait, the method further includes pre-training the super-resolution model of the portrait, including:

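acquiring a portrait training data set;

inputting the portrait training data set into the portrait super-resolution model to obtain a generated data set;

inputting the generated data set into a super-resolution discriminator network to obtain a generated data discrimination result;

and updating parameters of the portrait super-resolution model according to the generated data discrimination result.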

In the embodiment of the invention, before the super-resolution denoising processing is performed on the preprocessed image through the pre-trained portrait super-resolution model to obtain the super-resolution image, the portrait super-resolution model is pre-trained. Specifically, a portrait training data set is acquired and input into the portrait super-resolution model for training. The embodiment of the invention may capture high-definition portrait pictures with photographing equipment to construct the training data set, or may obtain the training data set from an open-source portrait picture database. The training data set comprises paired portrait pictures, namely a low-definition portrait picture and a high-definition portrait picture, where the low-definition picture is obtained by applying random JPEG compression, noise addition, blurring and scaling to the high-definition picture. In addition, the embodiment of the invention trains the model with a generative adversarial network (GAN) training method: the portrait training data set is input into the portrait super-resolution model to obtain a generated data set after portrait super-resolution processing; the generated data set is input into a super-resolution discriminator network for discrimination to obtain a generated data discrimination result; and the parameters of the portrait super-resolution model are updated according to the generated data discrimination result. The super-resolution discriminator network may adopt a classical network structure such as VGG16, VGG19 or ResNet.
In the generative adversarial network training method, the portrait super-resolution model serves as the generator of the generative adversarial network, the data generated by the generator is judged by a discriminator, and the parameters of the discriminator and the generator are updated according to the discrimination result. Specifically, n high-definition samples are drawn from the portrait training set, and the n corresponding ordinary (low-definition) portrait samples are input into the portrait super-resolution model, that is, the generator, to produce n generated samples. With the generator G fixed, the discriminator D is trained to distinguish real samples from generated ones as accurately as possible. After the discriminator D has been updated k times, the generator G is updated once so that the discriminator can no longer tell, as far as possible, whether a sample is real or generated. Here n and k are positive integers whose specific values can be determined according to the actual training process. Training the portrait super-resolution model with a generative adversarial network improves its super-resolution effect.
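The alternating update schedule described above (k discriminator updates followed by one generator update) can be sketched independently of any deep learning framework. In this illustration `update_discriminator` and `update_generator` are hypothetical callables standing in for the actual optimizer steps, and the returned log exists only to make the schedule visible.

```python
from typing import Any, Callable, Iterable, List


def train_adversarially(
    batches: Iterable[Any],
    update_discriminator: Callable[[Any], None],
    update_generator: Callable[[Any], None],
    k: int = 1,
) -> List[str]:
    """Alternate k discriminator updates with one generator update."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    log: List[str] = []
    for step, batch in enumerate(batches, start=1):
        # The generator is held fixed while the discriminator learns to
        # separate real high-definition samples from generated ones.
        update_discriminator(batch)
        log.append("D")
        if step % k == 0:
            # The discriminator is held fixed while the generator is updated
            # so that its outputs become harder to tell apart from real samples.
            update_generator(batch)
            log.append("G")
    return log
```

For example, with six batches and k = 3 the schedule is three discriminator steps, one generator step, and the same pattern once more.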

Further as an optional embodiment, referring to fig. 5, in step S104, the performing, by using a pre-trained cartoon model, cartoon processing on the super-resolution image to obtain a cartoon figure includes:

S401, inputting the super-resolution image into the pre-trained cartoon model;

S402, performing feature extraction processing on the super-resolution image through an encoder to obtain a feature vector;

S403, converting the feature vector through a decoder to obtain the cartoon figure.

In the embodiment of the invention, the super-resolution image is input into a pre-trained cartoon model comprising an encoder and a decoder; feature extraction is performed on the super-resolution image through the encoder to obtain a feature vector, and the feature vector is converted through the decoder to obtain the cartoon figure. After the cartoon processing is completed, the embodiment of the invention removes the filled area according to the edge filling parameters used during the preprocessing in step S102 and restores the image to the size of the input image. Referring to fig. 6, the cartoon model in the embodiment of the invention uses a U-Net-like encoder-decoder network architecture. The U-Net network structure is symmetrical and resembles the letter U, hence its name. The encoder consists of five convolution blocks and the decoder likewise consists of five convolution blocks, so the encoder and the decoder are symmetrical. The encoder extracts features and pools them to reduce the feature dimension; the decoder extracts features and up-samples them to restore the feature dimension. At each level, the features extracted by the encoder and decoder convolution blocks of that level are fused by concatenation.
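Removing the filled area and restoring the input size can be sketched as below. The sketch assumes that padding was added only at the bottom and right edges, that the padding amounts are given in input pixels, and that the super-resolution stage enlarged the image by an integer factor `scale`; none of these details are stated in the description, and block averaging is only one possible way to shrink the result back to the input size.

```python
import numpy as np


def remove_padding(output: np.ndarray, padding, scale: int = 1) -> np.ndarray:
    """Crop the rows and columns that correspond to bottom/right edge filling."""
    pad_h, pad_w = padding
    height, width = output.shape[:2]
    # Padding measured in input pixels covers pad * scale pixels after up-sampling.
    return output[: height - pad_h * scale, : width - pad_w * scale]


def downsample_by_averaging(image: np.ndarray, scale: int) -> np.ndarray:
    """Shrink an H x W x C image by an integer factor using block averages."""
    if scale == 1:
        return image
    height, width, channels = image.shape
    blocks = image.reshape(height // scale, scale, width // scale, scale, channels)
    return blocks.mean(axis=(1, 3))
```

With a 50 x 70 input padded to 64 x 96 and an assumed factor of 2, the 128 x 192 output is cropped to 100 x 140 and then averaged down to 50 x 70.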

Further as an optional implementation manner, referring to fig. 7, before the performing, by using a pre-trained animation model, animation processing on the super-resolution image to obtain an animation portrait, the method further includes pre-training the animation model, and specifically includes:

S501, acquiring face image data, and performing data augmentation processing on the face image data to obtain cartoon training data;

S502, performing cartoon treatment on the cartoon training data through the cartoon model to obtain cartoon generation data;

S503, carrying out identification processing on the cartoon generation data through a cartoon identifier to obtain a cartoon identification result;

S504, updating parameters of the animation model according to the animation identification result.

In the embodiment of the invention, before the super-resolution image is subjected to cartoon processing through the pre-trained cartoon model to obtain the cartoon figure, the cartoon model is pre-trained. Face image data is first acquired, and the cartoon model is constructed by generative adversarial network training. The training data is paired face cartoon data, namely an ordinary face image and the corresponding face cartoon image, and the same data augmentation processing is applied to both images of each pair to obtain the cartoon training data. The cartoon training data is subjected to cartoon processing through the cartoon model to obtain cartoon generation data, and the cartoon generation data is then identified by the cartoon identifier, which acts as the discriminator of the generative adversarial network, to obtain a cartoon identification result. The cartoon identifier may adopt a classical network structure such as VGG16, VGG19 or ResNet. In addition, the loss function used to train the cartoon model in the embodiment of the invention is as follows:

L_total = 1 * L_MAE + 10 * L_Face + 0.05 * L_percep + 0.1 * L_GAN

where L_MAE is the mean absolute error loss between the generated cartoon image and the target cartoon image, L_Face is the mean absolute error loss between the face regions of the generated cartoon image and the target cartoon image (the face region is obtained with the open-source BiSeNet face parsing model), L_percep is the perceptual loss, and L_GAN is the GAN loss of the generator.
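The weighted sum can be written out as a small NumPy sketch. The two mean absolute error terms are computed directly, while the perceptual loss and the GAN loss are passed in as precomputed numbers because they depend on networks not shown here; the function names and the boolean `face_mask` argument (standing in for the face region that a face parsing model would provide) are illustrative assumptions.

```python
import numpy as np


def mean_absolute_error(generated: np.ndarray, target: np.ndarray) -> float:
    """Mean absolute error over all pixels and channels."""
    return float(np.mean(np.abs(generated - target)))


def face_mean_absolute_error(generated: np.ndarray, target: np.ndarray,
                             face_mask: np.ndarray) -> float:
    """Mean absolute error restricted to pixels where the H x W mask is True."""
    if not face_mask.any():
        return 0.0  # assumption: no face pixels contributes no face loss
    return float(np.mean(np.abs(generated[face_mask] - target[face_mask])))


def total_loss(generated: np.ndarray, target: np.ndarray, face_mask: np.ndarray,
               perceptual_loss: float, gan_loss: float) -> float:
    """L_total = 1 * L_MAE + 10 * L_Face + 0.05 * L_percep + 0.1 * L_GAN."""
    l_mae = mean_absolute_error(generated, target)
    l_face = face_mean_absolute_error(generated, target, face_mask)
    return 1.0 * l_mae + 10.0 * l_face + 0.05 * perceptual_loss + 0.1 * gan_loss
```

The weight of 10 on the face term means that errors inside the face region count far more than the same errors elsewhere in the image.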

According to the embodiment of the invention, when the cartoon model is constructed, the cartoon model trained based on paired face data can be directly applied to a whole-body image scene through data augmentation strategies such as zooming, rotation and edge filling, so that the difficulty in constructing the whole-body image cartoon model is reduced, the end-to-end cartoon processing model is obtained through training, and the end-to-end cartoon processing can be carried out without depending on the face key point detection and segmentation model, so that any portrait image uploaded by a user is converted into a cartoon image, and the cartoon processing effect is improved.

Further as an optional embodiment, referring to fig. 8, in step S501, the performing data augmentation processing on the face image data to obtain cartoon training data includes:

S601, performing scaling and rotation processing on the face image data to obtain image augmentation data;

S602, performing edge filling processing on the image augmentation data to obtain cartoon training data.

In the embodiment of the invention, scaling and rotation processing is performed on the face image data to obtain image augmentation data, and edge filling processing is then performed on the image augmentation data. Scaling, rotation and edge filling simulate the positions at which a face may appear in a real scene, and the edge filling also allows the cartoon model to learn the texture mapping of regions other than the face. Because the data augmentation lets the cartoon model learn the texture mapping of non-face regions, a cartoon model trained on paired face data can be applied directly to whole-body image scenes, which reduces the difficulty of constructing a whole-body cartoon model and improves the cartoon effect of the cartoon model.
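A simplified version of the paired augmentation is sketched below. To stay within plain NumPy it restricts rotation to multiples of 90 degrees, uses nearest-neighbour scaling, and fills the surrounding canvas with zeros; these simplifications, the scale range, the canvas size and the function name `augment_pair` are all assumptions. The point it illustrates is that one set of random parameters is drawn and applied to both images of a pair.

```python
import numpy as np


def augment_pair(face: np.ndarray, cartoon: np.ndarray,
                 rng: np.random.Generator, canvas_size: int = 256):
    """Apply the same random scaling, rotation and edge filling to a face/cartoon pair."""
    quarter_turns = int(rng.integers(0, 4))           # rotation by 0, 90, 180 or 270 degrees
    side = max(1, int(canvas_size * rng.uniform(0.3, 0.8)))  # scaled side length
    top = int(rng.integers(0, canvas_size - side + 1))       # random placement on the canvas
    left = int(rng.integers(0, canvas_size - side + 1))

    def transform(image: np.ndarray) -> np.ndarray:
        rotated = np.rot90(image, quarter_turns)
        # Nearest-neighbour resize to side x side.
        rows = np.arange(side) * rotated.shape[0] // side
        cols = np.arange(side) * rotated.shape[1] // side
        resized = rotated[rows][:, cols]
        # Edge filling: place the resized image on a larger zero-filled canvas.
        canvas = np.zeros((canvas_size, canvas_size, image.shape[2]), dtype=image.dtype)
        canvas[top:top + side, left:left + side] = resized
        return canvas

    return transform(face), transform(cartoon)
```

Because both images share the same parameters, each face pixel stays aligned with its cartoon counterpart after augmentation, which paired training requires.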

In one possible embodiment of the present invention, a portrait picture uploaded by the user is first obtained and preprocessed. Super-resolution processing is then performed on the portrait picture through the pre-trained portrait super-resolution model to obtain a high-definition portrait picture, and the high-definition portrait picture is input into the cartoon model to obtain a cartoon image.

In another possible embodiment, a portrait picture uploaded by the user is obtained and preprocessed, and the preprocessed portrait picture is input directly into the cartoon model to obtain a cartoon image.

In yet another possible embodiment, a portrait video uploaded by the user is obtained and preprocessed frame by frame. Super-resolution processing is performed on the video frames frame by frame through the pre-trained portrait super-resolution model to obtain high-definition portrait frames, and each high-definition portrait frame is input into the cartoon model to obtain a cartoon image frame. The cartoon image frames are then written in sequence and encoded into a video to obtain the cartoon video result.

In addition, in a further feasible embodiment of the invention, a portrait video uploaded by the user is obtained and preprocessed frame by frame, and the video frames are input into the cartoon model to obtain cartoon image frames. The cartoon image frames are then written in sequence and encoded into a video to obtain the cartoon video result.
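The four variants above differ only in whether the input is a single picture or a sequence of frames and whether the super-resolution stage is applied. A minimal sketch of that control flow follows; the function names are hypothetical, the three stages are passed in as plain callables standing in for the preprocessing step and the two pre-trained models, and video decoding and encoding are left to the caller.

```python
from typing import Callable, Iterable, List, Optional, TypeVar

T = TypeVar("T")
Stage = Callable[[T], T]


def cartoonize_image(picture: T, preprocess: Stage, cartoon_model: Stage,
                     sr_model: Optional[Stage] = None) -> T:
    """Preprocess, optionally super-resolve, then apply the cartoon model."""
    image = preprocess(picture)
    if sr_model is not None:
        image = sr_model(image)  # variant with the super-resolution stage
    return cartoon_model(image)


def cartoonize_video(frames: Iterable[T], preprocess: Stage,
                     cartoon_model: Stage,
                     sr_model: Optional[Stage] = None) -> List[T]:
    """Apply the image pipeline frame by frame, preserving frame order.

    The returned frames would then be encoded into a video by the caller.
    """
    return [cartoonize_image(f, preprocess, cartoon_model, sr_model)
            for f in frames]
```

Passing `sr_model=None` gives the variants without super-resolution, so one code path covers all four embodiments.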

In another aspect, referring to fig. 9, an embodiment of the invention further provides a portrait animation processing device, which includes:

a first module 901, configured to obtain a portrait picture;

a second module 902, configured to preprocess the portrait picture to obtain a preprocessed image;

a third module 903, configured to perform super-resolution denoising processing on the preprocessed image through a pre-trained portrait super-resolution model, so as to obtain a super-resolution image;

a fourth module 904, configured to perform cartoon processing on the super-resolution image through a pre-trained cartoon model, so as to obtain a cartoon figure.

It can be understood that the content of the above embodiment of the portrait animation processing method applies to this embodiment of the portrait animation processing device. The functions specifically realized by the device embodiment are the same as those of the method embodiment, and the beneficial effects achieved are likewise the same as those achieved by the method embodiment.

Referring to fig. 10, an embodiment of the present invention further provides an electronic device including a processor 1002 and a memory 1001; the memory is used for storing a program; the processor executes the program to implement the method as described above.

It can be understood that the content of the above embodiment of the portrait animation processing method applies to this embodiment of the electronic device. The functions specifically realized by the electronic device embodiment are the same as those of the method embodiment, and the beneficial effects achieved are likewise the same as those achieved by the method embodiment.

Corresponding to the method of fig. 1, an embodiment of the present invention also provides a computer-readable storage medium storing a program to be executed by a processor to implement the method as described above.

Similarly, the content of the embodiment of the portrait animation processing method applies to this embodiment of the computer-readable storage medium. The functions realized by the storage medium embodiment are the same as those of the method embodiment, and the beneficial effects achieved are likewise the same as those achieved by the method embodiment.

Embodiments of the present invention also disclose a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The computer instructions may be read from a computer-readable storage medium by a processor of a computer device, and executed by the processor, to cause the computer device to perform the method shown in fig. 1.

Similarly, the content in the embodiment of the portrait animation processing method is applicable to the embodiment of the computer program product or the computer program, and the functions of the embodiment of the computer program product or the embodiment of the computer program are the same as those of the embodiment of the portrait animation processing method, and the achieved beneficial effects are the same as those of the embodiment of the portrait animation processing method.

In summary, the embodiment of the invention has the following advantages. The super-resolution technology is combined with the animation technology: the super-resolution model sharpens the input image to remove its noise and blurred regions, which reduces the interference of noise and blur with the animation model and improves the animation effect. In the process of constructing the cartoon model, data augmentation strategies such as scaling, rotation and edge filling allow a cartoon model trained on paired face data to be applied directly to whole-body image scenes, which reduces the difficulty of constructing a whole-body image cartoon model. The end-to-end cartoon processing model provided by the embodiment of the invention performs cartoon processing without depending on face key point detection or segmentation models, so that any image uploaded by a user can be converted into a cartoon image and the cartoon processing effect is improved.

In some alternative embodiments, the functions/acts noted in the block diagrams may occur out of the order noted in the operational illustrations. For example, two blocks shown in succession may in fact be executed substantially concurrently or the blocks may sometimes be executed in the reverse order, depending upon the functionality/acts involved. Furthermore, the embodiments presented and described in the flowcharts of the present invention are provided by way of example in order to provide a more thorough understanding of the technology. The disclosed methods are not limited to the operations and logic flows presented herein. Alternative embodiments are contemplated in which the order of various operations is changed, and in which sub-operations described as part of a larger operation are performed independently.

Furthermore, while the invention is described in the context of functional modules, it should be appreciated that, unless otherwise indicated, one or more of the described functions and/or features may be integrated in a single physical device and/or software module or one or more functions and/or features may be implemented in separate physical devices or software modules. It will also be appreciated that a detailed discussion of the actual implementation of each module is not necessary to an understanding of the present invention. Rather, the actual implementation of the various functional modules in the apparatus disclosed herein will be apparent to those skilled in the art from consideration of their attributes, functions and internal relationships. Accordingly, one of ordinary skill in the art can implement the invention as set forth in the claims without undue experimentation. It is also to be understood that the specific concepts disclosed are merely illustrative and are not intended to be limiting upon the scope of the invention, which is to be defined in the appended claims and their full scope of equivalents.

The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part of it that contributes over the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or various other media capable of storing program code.

Logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions for implementing logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.

More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). In addition, the computer readable medium may even be paper or other suitable medium on which the program is printed, as the program may be electronically captured, via, for instance, optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner, if necessary, and then stored in a computer memory.

It is to be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above-described embodiments, the various steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, they may be implemented using any one or a combination of the following techniques, which are well known in the art: discrete logic circuits having logic gates for implementing logic functions on data signals, application-specific integrated circuits having suitable combinational logic gates, Programmable Gate Arrays (PGAs), Field Programmable Gate Arrays (FPGAs), and the like.

In the description of the present specification, a description referring to terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.

While embodiments of the present invention have been shown and described, it will be understood by those of ordinary skill in the art that: many changes, modifications, substitutions and variations may be made to the embodiments without departing from the spirit and principles of the invention, the scope of which is defined by the claims and their equivalents.

While the preferred embodiment of the present application has been described in detail, the present application is not limited to the embodiments described above, and those skilled in the art can make various equivalent modifications or substitutions without departing from the spirit of the present application, and these equivalent modifications or substitutions are included in the scope of the present application as defined in the appended claims.

Claims

1. A method for processing a portrait animation, the method comprising:

Acquiring a portrait picture;

preprocessing the portrait picture to obtain a preprocessed image;

Performing cartoon treatment on the super-resolution image through a pre-trained cartoon model to obtain a cartoon figure;

the super-resolution denoising processing is carried out on the preprocessed image through a pre-trained human image super-resolution model to obtain a super-resolution image, and the super-resolution denoising processing comprises the following steps:

2. The method of claim 1, wherein preprocessing the portrait picture to obtain a preprocessed image comprises:

decoding the portrait picture to obtain image data;

Performing edge filling processing on the image data to obtain a filled image;

3. The method of claim 1, wherein prior to the super-resolution denoising of the preprocessed image by the pre-trained portrait super-resolution model, the method further comprises pre-training the portrait super-resolution model, comprising:

Acquiring a portrait training data set;

4. The method according to claim 1, wherein said performing a cartoon process on said super-resolution image by means of a pre-trained cartoon model to obtain a cartoon figure comprises:

inputting the super-resolution image into the pre-trained cartoon model;

5. The method according to claim 1, wherein before the performing the cartoon processing on the super-resolution image through the pre-trained cartoon model to obtain a cartoon figure, the method further includes pre-training the cartoon model, specifically including:

6. The method of claim 5, wherein the performing data augmentation processing on the face image data to obtain cartoon training data comprises:

scaling and rotating the face image data to obtain image augmentation data;

7. A portrait animation processing device, the device comprising:

the first module is used for acquiring a portrait picture;

A fourth module, configured to perform cartoon processing on the super-resolution image through a pre-trained cartoon model, so as to obtain a cartoon figure;

The third module is configured to perform super-resolution denoising processing on the preprocessed image through a pre-trained portrait super-resolution model, to obtain a super-resolution image, and includes:

8. An electronic device comprising a memory and a processor;

the memory is used for storing programs;

the processor, when executing the program, implements the method of any one of claims 1 to 6.

9. A computer readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the method of any one of claims 1 to 6.