CN113592971A - Virtual human body image generation method, system, equipment and medium - Google Patents
- Publication number: CN113592971A (application CN202110865481.2A)
- Authority: CN (China)
- Prior art keywords: human body, target, body image, posture, source
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—2D [Two Dimensional] image generation
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention discloses a virtual human body image generation method, system, equipment and medium. The method comprises: inputting a source human body image and a target posture image into a pre-trained virtual human body image generation network to obtain a target posture human body image. The virtual human body image generation network is a convolutional neural network comprising: an encoder, which takes the source human body image and the target posture image as input and encodes them into source human body features and target human body features; a structure-based appearance generation module, which takes the source human body features and the target human body features as input and updates them to obtain updated source human body features and target human body features; and a decoder, which takes the target human body features output by the structure-based appearance generation module as input and decodes them into the target posture human body image. By using a posture-guided virtual human body image generation network based on the human body structure, the invention generates realistic human body images with the correct target posture.
Description
Technical Field
The invention belongs to the intersection of computer vision and computer graphics, and particularly relates to a virtual human body image generation method, system, equipment and medium.
Background
Given a source human body image and a target posture image, the posture-guided virtual human body image generation task aims to generate a new human body image in the target posture, such that the human body posture of the generated image is consistent with the target posture and its appearance is similar to that of the source human body image. The task has many application scenarios, such as film production, virtual reality, and data augmentation for action recognition tasks.
At present, posture-guided virtual image generation methods mainly have the following defect:
Generating a virtual human body image in a target posture requires considering both the posture consistency and the appearance consistency of the generated image. The target posture usually differs greatly from the posture in the source human body image and is very complex. Existing methods usually consider how to deform the source human body image to obtain a human body image in the target posture; because the effectiveness of the deformation cannot be guaranteed, the human body posture in the generated image is unclear or even inconsistent with the target posture, so the quality of the generated human body image in the specified posture is very low.
In summary, a new method, system, device and medium for posture-guided virtual human body image generation based on the human body structure is needed.
Disclosure of Invention
The present invention aims to provide a virtual human body image generation method, system, device and medium that solve one or more of the above problems. By using a posture-guided virtual human body image generation network based on the human body structure, the invention generates realistic human body images with the correct target posture.
To achieve this purpose, the invention adopts the following technical scheme:
The virtual human body image generation method of the invention comprises the following steps:
inputting a source human body image and a target posture image into a pre-trained virtual human body image generation network, which outputs a target posture human body image;
wherein the virtual human body image generation network is a convolutional neural network comprising:
an encoder, which takes the source human body image and the target posture image as input and encodes them into source human body features and target human body features;
a structure-based appearance generation module, which takes the source human body features and the target human body features as input and updates them to obtain updated source human body features and target human body features;
and a decoder, which takes the target human body features output by the structure-based appearance generation module as input and decodes them into the target posture human body image.
A further improvement of the invention is that the step of obtaining the pre-trained virtual human body image generation network specifically comprises:
acquiring a sample data set, wherein each sample in the sample data set comprises source human body image sample data, target human body image sample data, source human body posture sample data and target human body posture sample data;
inputting the source human body image sample data, source human body posture sample data and target human body posture sample data of a selected sample into the virtual human body image generation network to obtain virtual target human body image data; constructing a loss function based on the virtual target human body image data and the target human body image sample data of the selected sample, and iteratively optimizing the virtual human body image generation network;
and obtaining the trained virtual human body image generation network once a preset number of iterations or a convergence condition is reached.
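The stopping rule in the training step (a preset number of iterations or a convergence condition) can be sketched as follows. This is a minimal illustration only: `ToyModel`, its quadratic loss, the learning rate and the tolerance `tol` are hypothetical stand-ins and are not taken from the patent, which trains a convolutional network with the loss functions described later.

```python
class ToyModel:
    """Hypothetical stand-in for the generation network: one scalar weight."""

    def __init__(self):
        self.w = 0.0

    def step(self, lr=0.1):
        # One gradient-descent update on the toy loss (w - 3)^2.
        grad = 2.0 * (self.w - 3.0)
        self.w -= lr * grad
        return (self.w - 3.0) ** 2  # loss after the update


def train(step_fn, max_iters=1000, tol=1e-6):
    """Call step_fn until max_iters is reached or the loss change drops below tol."""
    prev = float("inf")
    for it in range(1, max_iters + 1):
        loss = step_fn()
        if abs(prev - loss) < tol:  # convergence condition (assumed form)
            return it, loss
        prev = loss
    return max_iters, prev  # preset iteration count reached


model = ToyModel()
iters, final_loss = train(model.step)
```

In a real setting `step_fn` would run one forward pass, loss evaluation and optimizer update on a selected sample or batch.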
A further improvement of the invention is that the structure-based appearance generation module comprises:
a structure-aware adaptive normalization module, which takes the source human body features and the target human body features as input and generates and outputs stylized target posture features;
and a feature enhancement module, which takes the generated stylized target posture features and the source human body features as input and outputs the updated source human body features and the updated target human body features.
In the sample data set, the step of obtaining the source human body posture sample data and the target human body posture sample data in each sample comprises: performing posture estimation on the human body image with the OpenPose posture estimation method to obtain a coordinate sequence of 18 human body joint points, where the joint points of the source human body image $I_s$ are expressed as $P(I_s)=\{p_1,\dots,p_K\}$, $K=18$, and those of the target human body image $I_t$ are expressed as $P(I_t)=\{p_1,\dots,p_K\}$, $K=18$; and, based on the obtained joint point coordinate sequences, representing the human body posture information with $K$ heat maps, where the source human body posture information is denoted $P_s$ and the target human body posture information is denoted $P_t$.
In the encoder, the step of encoding to obtain the source human body features and the target human body features specifically comprises: encoding the target posture information $P_t$ into the target human body feature $C_t$ with 2 downsampling convolutional layers; and encoding the source human body image $I_s$ and its posture information $P_s$ into the source human body feature $C_s$ with 2 downsampling convolutional layers.
A further improvement of the invention is that, in the structure-based appearance generation module, the step of inputting and updating the source human body features and the target human body features to obtain updated source human body features and target human body features specifically comprises:
dividing the human body image into several human body parts and 1 background part based on the obtained human body joint point coordinate sequence, to obtain $L$ part masks; the part masks of the source human body image are denoted $M_s$ and the part masks of the target human body image are denoted $M_t$.
The target human body feature $C_t$ is convolved with two convolutional layers to obtain the target posture feature $F_t$, and the source human body feature $C_s$ is convolved with two convolutional layers to obtain the source human body feature $F_s$. A style vector $V_{sty}$ is generated from the source human body feature $F_s$ and the part masks $M_s$ of the source human body image, where each row $V_{sty}^{l}$ of $V_{sty}$ is a $C$-dimensional vector representing the features of one part of the source human body image. The style vector of the $l$-th part is obtained by mean pooling:
$V_{sty}^{l} = \mathrm{Pool}\left(\mathrm{Resize}(M_s^{l}) \odot F_s\right)$
where $\mathrm{Resize}(\cdot)$ denotes a scaling operation, $\odot$ denotes element-by-element multiplication, and $\mathrm{Pool}(\cdot)$ denotes pooling;
according to the correspondence between the parts of the source human body image and the target human body image, the style vectors $V_{sty}$ are inserted into the corresponding parts of the part masks $M_t$ of the target human body image to obtain the style matrix $T_{sty}$. Specifically, the $l$-th style vector $V_{sty}^{l}$ is broadcast into the $l$-th mask $M_t^{l}$ of the target human body image to generate the $l$-th style matrix $T_{sty}^{l}$, and all $L$ style matrices $T_{sty}^{l}$, $l = 0, \dots, L$, are added element by element to obtain the final style matrix:
$T_{sty} = \sum_{l} T_{sty}^{l}$
Two convolutional layers are applied to the style matrix $T_{sty}$ to obtain the modulation parameters $\gamma$ and $\beta$ of the normalization operation.
Batch normalization is applied to the target posture feature $F_t$ to obtain $F_{norm}$, and $F_{norm}$ is modulated with $\gamma$ and $\beta$ to obtain the stylized target posture feature $F_{sty}$:
$F_{sty} = \gamma F_{norm} + \beta$
The stylized target posture feature $F_{sty}$ and the source human body feature $F_s$ are concatenated and fused, and the fused feature is then enhanced with a Squeeze-and-Excitation operation to obtain the enhanced feature $F_{fuse}$,
from which the updated target human body feature $C'_t$ is obtained;
the updated source human body feature $C'_s$ is the concatenation of the source human body feature $F_s$ and the updated target human body feature $C'_t$.
A further improvement of the invention is that the loss function comprises: an adversarial loss function, a perceptual loss function, and a loss function based on human body structure similarity.
A further improvement of the present invention is that the structure-based appearance generation module is replaced with an integrated appearance generation module;
the integrated appearance generating module is composed of a plurality of the structure-based appearance generating modules in a cascade.
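The cascade can be pictured as repeated application of one block to the pair of source and target features. The sketch below is illustrative only: `toy_block` is a hypothetical placeholder for a structure-based appearance generation module, and the features are plain numbers rather than tensors. The detailed description uses T = 9 cascaded blocks.

```python
def run_cascade(blocks, c_s, c_t):
    """Feed (source, target) features through each block in order.

    Every block returns the updated pair, which becomes the input
    of the next block in the cascade.
    """
    for block in blocks:
        c_s, c_t = block(c_s, c_t)
    return c_s, c_t


def toy_block(c_s, c_t):
    # Hypothetical placeholder update, not the patent's SAG-Blk.
    return c_s + 1, c_t + 2


final_c_s, final_c_t = run_cascade([toy_block] * 9, 0, 0)
```

Only the target feature returned by the last block is passed on to the decoder.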
The virtual human body image generation system of the invention comprises:
an image generation module, which inputs the source human body image and the target posture image into a pre-trained virtual human body image generation network, which outputs a target posture human body image;
wherein the virtual human body image generation network is a convolutional neural network comprising:
an encoder, which takes the source human body image and the target posture image as input and encodes them into source human body features and target human body features;
a structure-based appearance generation module, which takes the source human body features and the target human body features as input and updates them to obtain updated source human body features and target human body features;
and a decoder, which takes the target human body features output by the structure-based appearance generation module as input and decodes them into the target posture human body image.
An electronic device of the present invention includes a processor and a memory, the processor being configured to execute a computer program stored in the memory to implement any one of the above virtual human body image generation methods of the present invention.
A computer-readable storage medium of the present invention stores at least one instruction which, when executed by a processor, implements any one of the above virtual human body image generation methods.
Compared with the prior art, the invention has the following beneficial effects:
The invention provides a novel posture-guided virtual human body image generation method based on the human body structure, which can generate realistic human body images with the correct target posture. Specifically, existing methods cannot guarantee the effectiveness of deformation, produce low-quality human body images in the specified posture, easily generate blurred human body postures, and may even fail to maintain posture consistency. To address these technical problems, the invention constructs a posture-guided virtual human body image generation network based on the human body structure (SAGN), which is a convolutional neural network, and iteratively optimizes it to obtain the pre-trained SAGN that realizes posture-guided virtual human body image generation. SAGN generates the appearance directly from the target posture, which maximally ensures that the human body posture of the generated image is consistent with the target posture while the image retains a realistic human body appearance; it can therefore generate realistic human body images with the correct posture and realize virtual human body image generation under target posture guidance. It also provides a new idea for the difficult task of generating human body images in a target posture.
In the system of the invention, to address the problems that existing methods cannot guarantee the effectiveness of deformation, produce low-quality human body images in the specified posture, easily generate blurred human body postures, and may even fail to maintain posture consistency, a posture-guided virtual human body image generation network based on the human body structure (SAGN) is introduced. SAGN consists of a series of structure-based appearance generation modules (SAG-Blk). Each structure-based appearance generation module consists of a structure-aware adaptive normalization (SAN) submodule and a Feature Enhancement (FE) submodule. The SAN submodule generates stylized target posture features by a normalization method, and the FE submodule provides rich appearance information for the stylized target posture features, further enhancing their appearance features. The two submodules cooperate to gradually generate a realistic human body image with the correct posture, realizing virtual human body image generation under target posture guidance.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly introduced below; it is obvious that the drawings in the following description are some embodiments of the invention, and that for a person skilled in the art, other drawings can be derived from them without inventive effort.
FIG. 1 is a schematic flowchart of a method for generating a virtual human body image under posture guidance based on a human body structure according to an embodiment of the present invention;
FIG. 2 is a schematic view of the joint points of a human body according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a virtual human body image generation network (SAGN) under posture guidance based on a human body structure in an embodiment of the present invention;
FIG. 4 is a schematic diagram of a structure-aware adaptive normalization (SAN) module in an embodiment of the present invention;
FIG. 5 is a schematic diagram of partial results on the Market-1501 data set in an embodiment of the present invention;
FIG. 6 is a schematic diagram of partial results on the DeepFashion data set in an embodiment of the present invention.
Detailed Description
In order to make the purpose, technical effect and technical solution of the embodiments of the present invention clearer, the following clearly and completely describes the technical solution of the embodiments of the present invention with reference to the drawings in the embodiments of the present invention; it is to be understood that the described embodiments are only some of the embodiments of the present invention. Other embodiments, which can be derived by one of ordinary skill in the art from the disclosed embodiments without inventive faculty, are intended to be within the scope of the invention.
Example 1
Embodiment 1 of the present invention addresses the problems that existing methods cannot guarantee the effectiveness of deformation, produce low-quality human body images in the specified posture, easily generate blurred human body postures, and may even fail to maintain posture consistency. Considering that generating the appearance directly from the target posture maximally ensures consistency between the human body posture of the generated image and the target posture, the embodiment provides a posture-guided virtual human body image generation method based on the human body structure, built on the network SAGN.
SAGN consists of a series of structure-based appearance generation modules (SAG-Blk), each of which consists of a structure-aware adaptive normalization (SAN) submodule and a Feature Enhancement (FE) submodule. The SAN submodule generates stylized target posture features by a normalization method, and the FE submodule provides rich appearance information for the stylized target posture features, further enhancing their appearance features. The two submodules cooperate to gradually generate a realistic human body image with the correct posture, realizing virtual human body image generation under target posture guidance.
The method for generating the virtual human body image under the posture guidance based on the human body structure in the embodiment 1 of the invention comprises the following steps:
step 2, constructing a posture-guided virtual human body image generation network based on the human body structure (SAGN), where the construction of the SAGN specifically comprises:
constructing an Encoder; constructing a structure-based appearance generation module (SAG-Blk); and constructing a Decoder; where constructing the structure-based appearance generation module specifically comprises constructing a structure-aware adaptive normalization (SAN) module and a Feature Enhancement (FE) module;
step 4, constructing a loss function based on the virtual target human body image obtained in step 3 and the target human body image acquired in step 1, and iteratively optimizing the network constructed in step 2; after a preset number of iterations is reached, an optimized posture-guided virtual human body image generation network based on the human body structure is obtained, which realizes virtual human body image generation under target posture guidance and generates realistic human body images with the correct posture.
In embodiment 2 of the present invention, in step 1, the specific steps of obtaining the posture information of the source human body image and the target human body image from these two images include:
step 1.1, performing posture estimation on the human body images with a posture estimation method to obtain a preset number of joint point coordinates for the source human body image and for the target human body image;
and step 1.2, representing the human body posture information with heat maps based on the human body joint point coordinate sequences obtained in step 1.1, to obtain the source human body posture information and the target human body posture information.
Illustratively, step 1.1 in the embodiment of the present invention specifically includes: performing posture estimation on the human body image with the OpenPose posture estimation method to obtain a coordinate sequence of 18 human body joint points, where the joint points of the source human body image $I_s$ are expressed as $P(I_s)=\{p_1,\dots,p_K\}$, $K=18$, and those of the target human body image $I_t$ are expressed as $P(I_t)=\{p_1,\dots,p_K\}$, $K=18$.
Step 1.2 specifically includes: based on the human body joint point coordinate sequences obtained in step 1.1, representing the human body posture information with $K$ heat maps, where the source human body posture information is denoted $P_s$ and the target human body posture information is denoted $P_t$.
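Step 1.2 can be sketched as follows, assuming NumPy is available. The source only states that K heat maps represent the joints; the Gaussian shape, the value of `sigma`, the `(x, y)` coordinate order and the use of `None` for undetected joints are assumptions made for illustration.

```python
import numpy as np


def joints_to_heatmaps(joints, height, width, sigma=6.0):
    """Turn K joint coordinates into a (K, height, width) stack of heat maps.

    joints: list of (x, y) pixel coordinates, or None for a joint the
    pose estimator did not detect (its map stays all zero).
    """
    ys, xs = np.mgrid[0:height, 0:width]
    maps = np.zeros((len(joints), height, width), dtype=np.float32)
    for k, joint in enumerate(joints):
        if joint is None:
            continue
        x, y = joint
        # Assumed encoding: a Gaussian bump centred on the joint.
        maps[k] = np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2.0 * sigma ** 2))
    return maps


# Hypothetical example: one detected joint out of K = 18.
example_joints = [(10, 20)] + [None] * 17
p_s = joints_to_heatmaps(example_joints, height=64, width=32)
```

The resulting stack plays the role of the posture information P_s or P_t that is fed to the encoder.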
In embodiment 3 of the present invention, in step 2, the specific step of constructing a virtual human body image generation network (SAGN) based on the posture guidance of the human body structure includes:
step 2.1, constructing an Encoder, which encodes the input target posture information into the target human body feature, and the input source human body image together with its posture information into the source human body feature;
step 2.2, constructing a structure-based appearance generation module (SAG-Blk), which updates the target human body feature and the source human body feature from step 2.1 (or the new target and source human body features output by the previous SAG-Blk) and generates appearance information for the target human body feature according to the source human body feature, to obtain a new target human body feature and a new source human body feature; a total of T = 9 cascaded SAG-Blks gradually generate the appearance information of the target human body feature;
and step 2.3, constructing a Decoder, which decodes the new target human body feature output by the last SAG-Blk in step 2.2 to generate the human body image in the target posture.
Illustratively, step 2.1 in the embodiment of the present invention specifically includes: encoding the target posture information $P_t$ into the target human body feature $C_t$ with 2 downsampling convolutional layers; and encoding the source human body image $I_s$ and its posture information $P_s$ into the source human body feature $C_s$ with 2 downsampling convolutional layers.
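To see what two downsampling convolutional layers do to the spatial resolution, the standard convolution output-size formula can be applied twice. The kernel size 3, stride 2 and padding 1 used below are assumptions (the source gives only the number of layers), and the 128 x 64 input is a hypothetical example.

```python
def conv_out(size, kernel=3, stride=2, padding=1):
    """Output length of a convolution along one spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1


def encoder_spatial_size(height, width, num_layers=2):
    """Spatial size of the encoded feature after num_layers downsampling convs."""
    for _ in range(num_layers):
        height, width = conv_out(height), conv_out(width)
    return height, width


# Hypothetical 128 x 64 input: each stride-2 layer halves both axes.
feature_hw = encoder_spatial_size(128, 64)
```

Under these assumptions the features C_s and C_t have one quarter of the input height and width.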
In embodiment 4 of the present invention, in step 2.2, the specific step of constructing the structure-based appearance generating module (SAG-Blk) includes:
step 2.2.1, constructing a structure-aware adaptive normalization (SAN) module, which updates the target human body feature from step 2.1 (or the new target human body feature output by the previous SAG-Blk) by a normalization method and generates the stylized target posture feature;
step 2.2.2, constructing a Feature Enhancement (FE) module, which provides rich appearance information for the stylized target posture feature obtained in step 2.2.1; that is, the source human body feature from step 2.1 (or the new source human body feature output by the previous SAG-Blk) is used to further enhance the appearance features of the stylized target posture feature.
Illustratively, step 2.2.1 in the embodiment of the present invention specifically includes: based on the human body joint point coordinate sequence obtained in step 1.1, dividing the human body image into several human body parts and 1 background part to obtain $L$ part masks, where the part masks of the source human body image are denoted $M_s$ and the part masks of the target human body image are denoted $M_t$.
Step 2.2.1 further comprises: convolving the target human body feature $C_t$ (or the new target human body feature $C'_t$ output by the previous SAG-Blk) with two convolutional layers to obtain the target posture feature $F_t$; and convolving the source human body feature $C_s$ (or the new source human body feature $C'_s$ output by the previous SAG-Blk) with two convolutional layers to obtain the source human body feature $F_s$.
A style vector $V_{sty}$ is generated from the source human body feature $F_s$ and the part masks $M_s$ of the source human body image, where each row $V_{sty}^{l}$ of $V_{sty}$ is a $C$-dimensional vector representing the features of one part of the source human body image. Specifically, mean pooling is used to obtain the style vector of the $l$-th part:
$V_{sty}^{l} = \mathrm{Pool}\left(\mathrm{Resize}(M_s^{l}) \odot F_s\right)$
where $\mathrm{Resize}(\cdot)$ denotes a scaling operation that scales $M_s$ to the same spatial size as $F_s$, i.e., $H' \times W'$; $\odot$ denotes element-by-element multiplication; and $\mathrm{Pool}(\cdot)$ denotes a pooling operation in which all non-zero elements are pooled.
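The masked mean pooling for the style vectors can be sketched with NumPy as below. Several details are assumptions: nearest-neighbour scaling stands in for Resize, the masks are taken to be binary, and "all elements other than 0 are pooled" is read as averaging over the positions covered by each mask. The array layouts `(C, H', W')` and `(L, H, W)` are likewise chosen for illustration.

```python
import numpy as np


def resize_nearest(masks, out_h, out_w):
    """Nearest-neighbour resize of an (L, H, W) mask stack (assumed Resize)."""
    _, h, w = masks.shape
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return masks[:, rows][:, :, cols]


def style_vectors(f_s, m_s):
    """Masked mean pooling: f_s (C, H', W'), m_s (L, H, W) -> V_sty (L, C)."""
    _, hp, wp = f_s.shape
    m = resize_nearest(m_s, hp, wp).astype(f_s.dtype)   # (L, H', W')
    masked = m[:, None] * f_s[None]                     # (L, C, H', W')
    sums = masked.sum(axis=(2, 3))                      # (L, C)
    area = m.sum(axis=(1, 2))                           # (L,)
    # Guard against empty masks (a part that is not visible).
    return sums / np.maximum(area, 1.0)[:, None]
```

Row `l` of the result is the C-dimensional style vector of part `l`; a part with an empty mask yields a zero vector in this sketch.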
According to the correspondence between the parts of the source human body image and the target human body image, the style vectors $V_{sty}$ are inserted into the corresponding parts of the part masks $M_t$ of the target human body image to obtain the style matrix $T_{sty}$. Specifically, the $l$-th style vector $V_{sty}^{l}$ is broadcast into the $l$-th mask $M_t^{l}$ of the target human body image to generate the $l$-th style matrix $T_{sty}^{l}$, and all $L$ style matrices $T_{sty}^{l}$ are added element by element to obtain the final style matrix:
$T_{sty} = \sum_{l} T_{sty}^{l}$
The style matrix $T_{sty}$ obtained here has a human body posture consistent with the target human body posture and also contains the most critical human body appearance information.
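The broadcast-and-sum construction of the style matrix can be written in a few lines of NumPy. The sketch assumes binary target masks that have already been scaled to the feature resolution, and the array layouts are illustrative choices rather than details from the source.

```python
import numpy as np


def style_matrix(v_sty, m_t):
    """Broadcast each style vector into its target mask and sum over parts.

    v_sty: (L, C) style vectors; m_t: (L, H', W') binary target masks.
    Returns T_sty with shape (C, H', W').
    """
    per_part = v_sty[:, :, None, None] * m_t[:, None, :, :]  # (L, C, H', W')
    return per_part.sum(axis=0)
```

Because the masks of different parts do not overlap, every spatial position of the result carries the style vector of the part that covers it in the target posture.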
Two convolutional layers are applied to the style matrix $T_{sty}$ to obtain the modulation parameters $\gamma$ and $\beta$ of the normalization operation. Meanwhile, batch normalization is applied to the target posture feature $F_t$ to obtain $F_{norm}$. Finally, $F_{norm}$ is modulated with $\gamma$ and $\beta$ to obtain the stylized target posture feature $F_{sty}$:
$F_{sty} = \gamma F_{norm} + \beta$
Notably, the stylized target posture feature $F_{sty}$ obtained here keeps the posture information of the target posture and also contains the most critical human body appearance information.
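The modulation step can be illustrated with NumPy as follows. The batch normalization here uses per-channel batch statistics without learned affine parameters, and `eps` is an assumed constant; in the described network gamma and beta come from two convolutional layers applied to the style matrix, whereas in this sketch they are simply passed in as scalars or arrays that broadcast against the feature.

```python
import numpy as np


def batch_norm(f, eps=1e-5):
    """Normalize an (N, C, H, W) feature per channel over N, H and W."""
    mean = f.mean(axis=(0, 2, 3), keepdims=True)
    var = f.var(axis=(0, 2, 3), keepdims=True)
    return (f - mean) / np.sqrt(var + eps)


def modulate(f_t, gamma, beta):
    """F_sty = gamma * F_norm + beta, with gamma and beta broadcast to f_t."""
    return gamma * batch_norm(f_t) + beta
```

When gamma and beta are spatial maps of shape (C, H', W'), each position of the normalized target feature is scaled and shifted by the style of the body part that covers it.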
In step 2.2.2 of the embodiment of the present invention, the method specifically includes: concatenating and fusing the stylized target posture feature $F_{sty}$ with the source human body feature $F_s$, then enhancing the fused feature with a Squeeze-and-Excitation operation to obtain the enhanced feature $F_{fuse}$; a residual module is added to further accelerate network training, yielding the final new target human body feature $C'_t$.
Here, the feature fusion operation fuses $C_t$, $F_t$ and $F_s$ by concatenation and addition. Finally, the new source human body feature $C'_s$ is the concatenation of the source human body feature $F_s$ and the new target human body feature $C'_t$.
In step 2.3 of the embodiment of the present invention, the method specifically includes: decoding the new target human body feature $C'_t$ output by the last SAG-Blk with 2 upsampling convolutional layers to generate the human body image in the target posture.
In step 4 of the embodiment of the present invention, the loss function constructed from the virtual target human body image obtained in step 3 and the target human body image acquired in step 1 specifically includes: an adversarial loss function, a perceptual loss function, and a loss function based on human body structure similarity.
The system for generating a virtual human body image based on posture guidance of a human body structure in embodiment 4 of the present invention includes:
the sample acquisition module is used for acquiring a source human body image and a target human body image, and for obtaining source human body posture information and target human body posture information from the source human body image and the target human body image;
a network model construction module for constructing a virtual human body image generation network (SAGN) under the guidance of the posture based on the human body structure; wherein the human body image generation network (SAGN) comprises three parts: an Encoder (Encoder), a structure-based appearance generation module (SAG-Blk), a Decoder (Decoder); wherein the structure-based appearance generation module comprises two parts: a structure-aware adaptive normalization (SAN) module and a Feature Enhancement (FE) module;
the training module is used for inputting the source human body image, the source human body posture information and the target human body posture information into a constructed network (SAGN) to obtain a virtual target human body image;
an optimization module for constructing a loss function based on the virtual target human body image and the real target human body image, and iteratively optimizing the network (SAGN); after the preset number of iterations is reached, an optimized virtual human body image generation network under posture guidance based on the human body structure is obtained, which realizes virtual human body image generation under target posture guidance and generates a vivid human body image with a correct posture.
Existing methods cannot guarantee the validity of the deformation, generate low-quality human body images in the specified posture, easily produce blurred human body postures, and may even fail to maintain posture consistency. To address these problems, the system introduces a virtual human body image generation network (SAGN) under posture guidance based on the human body structure, which consists of a series of structure-based appearance generation modules (SAG-Blk). Each structure-based appearance generation module consists of a structure-aware adaptive normalization (SAN) sub-module and a Feature Enhancement (FE) sub-module. The SAN module generates stylized target posture features by means of normalization, and the FE module supplies rich appearance information to the stylized target posture features, further enhancing their appearance features. The two sub-modules work together to progressively generate a vivid human body image with a correct posture, thereby realizing virtual human body image generation under target posture guidance.
The method for generating the virtual human body image under the posture guidance based on the human body structure, disclosed by the embodiment 5 of the invention, comprises the following steps of:
1.1) carrying out posture estimation on the human body image by using a posture estimation method to obtain joint point coordinate sequences of a preset number of source human body images and joint point coordinate sequences of a target human body image;
1.2) based on the human body joint point coordinate sequence obtained in the step 1.1), representing human body posture information by heat map, and obtaining source human body posture information and target human body posture information.
Step 2, constructing a virtual human body image generation network (SAGN) under the guidance of the posture of the human body structure:
2.1) constructing an Encoder (Encoder), respectively encoding the input target posture information, the input source human body image and the input posture information into target human body characteristics and source human body characteristics to obtain an Encoder;
2.2) constructing a structure-based appearance generation module (SAG-Blk), which updates the target human body features and the source human body features from step 2.1) (or the new target human body features and new source human body features output by the previous SAG-Blk) and generates appearance information for the target human body features according to the source human body features, to obtain new target human body features and new source human body features; a total of T = 9 cascaded SAG-Blk progressively generate the appearance information of the target human body features, finally yielding 9 cascaded structure-based appearance generation modules;
2.3) constructing a Decoder (Decoder), decoding the new target human body characteristics output by the last SAG-Blk in the step 2.2), and generating a human body image of the target posture to obtain the Decoder.
1) organizing data input into a convolutional neural network;
2) generating a target posture human body image by using the convolutional neural network (SAGN) constructed in step 2.
Step 4, constructing a loss function of a convolutional neural network (SAGN):
4.1) constructing an adversarial loss function;
4.2) constructing a perceptual loss function;
4.3) constructing a loss function based on the similarity of human body structures;
step 5, optimizing network parameters, and realizing the generation of the virtual human body image under the guidance of the target posture:
5.1) carrying out iterative optimization on the network parameters constructed in the step 2 according to the loss function obtained in the step 4;
5.2) when the preset iteration number is reached, generating the virtual human body image under the guidance of the target posture by using the convolutional neural network (SAGN) constructed in the step 2.
Existing methods cannot guarantee the validity of the deformation, generate low-quality human body images in the specified posture, easily produce blurred human body postures, and may even fail to maintain posture consistency. To address these problems, the virtual human body image generation method under posture guidance based on the human body structure introduces a virtual human body image generation network (SAGN) under posture guidance based on the human body structure, which consists of a series of structure-based appearance generation modules (SAG-Blk). Each structure-based appearance generation module consists of a structure-aware adaptive normalization (SAN) sub-module and a Feature Enhancement (FE) sub-module. The SAN module generates stylized target posture features by means of normalization, and the FE module supplies rich appearance information to the stylized target posture features, further enhancing their appearance features. The two sub-modules work together to progressively generate a vivid human body image with a correct posture, thereby realizing virtual human body image generation under target posture guidance.
Referring to fig. 1, a method for generating a virtual human body image under posture guidance based on a human body structure according to an embodiment of the present invention includes the following steps:
1.1) carrying out posture estimation on the human body image by using a posture estimation method to obtain the K = 18 joint point coordinates of the source human body image and the K = 18 joint point coordinates of the target human body image.
In the embodiment of the invention, the OpenPose posture estimation method is used to perform posture estimation on the human body images and obtain coordinate sequences of 18 human body joint points; the joint point sequence of the source human body image $I_s$ is expressed as $P(I_s) = \{p_1, \dots, p_K\}$, $K = 18$, and the joint point sequence of the target human body image $I_t$ is expressed as $P(I_t) = \{p_1, \dots, p_K\}$, $K = 18$. Fig. 2 is a schematic diagram of the 18 joint points.
1.2) based on the human body joint point coordinate sequence obtained in the step 1.1), representing human body posture information by heat map, and obtaining source human body posture information and target human body posture information.
In the embodiment of the invention, in order to exploit the spatial characteristics of the human body joint point coordinates, K = 18 heat maps are used to represent the human body posture information; the source human body posture information is denoted $P_s$ and the target human body posture information is denoted $P_t$.
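The conversion from joint coordinates to heat maps in step 1.2 can be sketched as follows. The text does not say how the heat maps are drawn, so the Gaussian shape, the `sigma` value, the 128 x 64 image size and the function name `joints_to_heatmaps` are all assumptions made for illustration.

```python
import numpy as np

def joints_to_heatmaps(joints, height, width, sigma=6.0):
    """Turn K (x, y) joint coordinates into a K x H x W stack of heat maps.

    A joint given as None (not detected) yields an all-zero map.
    The Gaussian shape and sigma are illustrative assumptions.
    """
    ys, xs = np.mgrid[0:height, 0:width]
    maps = np.zeros((len(joints), height, width), dtype=np.float32)
    for k, joint in enumerate(joints):
        if joint is None:
            continue
        x, y = joint
        maps[k] = np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2.0 * sigma ** 2))
    return maps

# 18 joints: 17 placeholder detections plus one missing joint.
joints = [(10 + 2 * k, 20 + 5 * k) for k in range(17)] + [None]
heatmaps = joints_to_heatmaps(joints, height=128, width=64)
```

Each map peaks at 1.0 on its joint and decays with distance, which gives the encoder a spatial signal rather than a bare list of coordinates.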
Step 2, constructing a virtual human body image generation network (SAGN) under the guidance of the posture of the human body structure:
the virtual human body image generation network (SAGN) guided by the posture based on the human body structure is composed of an encoder, a structure-based appearance generation module and a decoder; fig. 3 is a schematic diagram of a virtual human body image generation network (SAGN) structure under the guidance of a human body structure-based posture.
2.1) constructing an Encoder (Encoder), and respectively encoding the input target attitude information and the input source human body image and attitude information.
In an embodiment of the present invention, the target posture information $P_t$ is encoded into the target human body feature $C_t$ with 2 downsampling convolutional layers; the source human body image $I_s$ and its posture information $P_s$ are encoded into the source human body feature $C_s$ with 2 downsampling convolutional layers.
2.2) building a structure-based appearance generation module (SAG-Blk).
In the embodiment of the invention, the virtual human body image generation network (SAGN) under posture guidance based on the human body structure contains a total of T = 9 structure-based appearance generation modules (SAG-Blk), each composed of a structure-aware adaptive normalization (SAN) module and a Feature Enhancement (FE) module. The SAN module generates stylized target posture features by means of normalization, and the FE module supplies rich appearance information to the stylized target posture features, further enhancing their appearance features. The two sub-modules work together to progressively generate a vivid human body image with a correct posture, thereby realizing virtual human body image generation under target posture guidance.
2.2.1) constructing a structure-aware adaptive normalization (SAN) module.
In the embodiment of the invention, based on the human body joint point coordinate sequence obtained in step 1.1, the human body image is divided into 10 human body parts and 1 background part, namely the head, the left and right upper arms, the left and right lower arms, the left and right thighs, the left and right shanks, the trunk, and the background, giving masks of L parts; the part masks of the source human body image are denoted $M_s$ and the part masks of the target human body image are denoted $M_t$.
Fig. 4 is a schematic diagram of the structure-aware adaptive normalization (SAN) module. In an embodiment of the invention, the target human body feature $C_t$ (or the new target human body feature $C'_t$ output by the previous SAG-Blk) is convolved with two convolutional layers to obtain the target human body feature $F_t$; the source human body feature $C_s$ (or the new source human body feature $C'_s$ output by the previous SAG-Blk) is convolved with two convolutional layers to obtain the source human body feature $F_s$.
According to the source human body feature $F_s$ and the part masks $M_s$ of the source human body image, a style vector $V_{sty} \in \mathbb{R}^{L \times C}$ is generated, where each row $V_{sty}^{l}$ ($l = 1, \dots, L$) is a C-dimensional vector representing the feature of the $l$-th part of the source human body image. Specifically, mean pooling is used to obtain the style vector of the $l$-th part:
$$V_{sty}^{l} = \mathrm{Pool}\big(F_s \odot \mathrm{Resize}(M_s^{l})\big)$$
where $\mathrm{Resize}(\cdot)$ denotes a scaling operation that resizes $M_s^{l}$ to the same size as $F_s$, i.e., $H' \times W'$; $\odot$ denotes element-by-element multiplication; and $\mathrm{Pool}(\cdot)$ denotes a pooling operation over all non-zero elements.
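The masked mean pooling that produces one style vector per body part can be sketched in NumPy as below. This is a minimal illustration rather than the implementation used in the embodiment: nearest-neighbour resizing is an assumed choice for Resize, the pooling averages over the pixels inside each resized mask as one reading of "all elements other than 0", and the names `resize_nearest` and `style_vectors` are hypothetical.

```python
import numpy as np

def resize_nearest(mask, out_h, out_w):
    """Nearest-neighbour resize of an (L, H, W) mask stack to (L, out_h, out_w)."""
    _, h, w = mask.shape
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return mask[:, rows[:, None], cols[None, :]]

def style_vectors(f_s, m_s):
    """Masked mean pooling: one C-dimensional style vector per part.

    f_s: (C, H', W') source feature; m_s: (L, H, W) binary part masks.
    Returns V_sty with shape (L, C).
    """
    _, h, w = f_s.shape
    m = resize_nearest(m_s, h, w).astype(f_s.dtype)   # (L, H', W')
    summed = np.einsum("chw,lhw->lc", f_s, m)          # feature sum inside each part
    area = m.sum(axis=(1, 2))[:, None]                 # pixel count of each part
    return summed / np.maximum(area, 1.0)              # empty parts give a zero vector

# Toy case: 2 channels, 2 parts (left half and right half of the image).
f_s = np.arange(8, dtype=np.float64).reshape(2, 2, 2)
m_s = np.zeros((2, 4, 4))
m_s[0, :, :2] = 1
m_s[1, :, 2:] = 1
v_sty = style_vectors(f_s, m_s)
```

In the toy case the left part averages the first column of every channel and the right part averages the second column, so each row of `v_sty` summarizes one part only.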
According to the correspondence between the parts of the source human body image and the parts of the target human body image, the style vector $V_{sty}$ is inserted into the corresponding parts of the part masks $M_t$ of the target human body image to obtain the style matrix $T_{sty}$. Specifically, the $l$-th style vector $V_{sty}^{l}$ is broadcast into the $l$-th mask $M_t^{l}$ of the target human body to generate the $l$-th style matrix $T_{sty}^{l}$, and all $L$ style matrices $T_{sty}^{l}$ ($l = 1, \dots, L$) are added element by element to obtain the final style matrix $T_{sty}$:
$$T_{sty} = \sum_{l=1}^{L} T_{sty}^{l}$$
The style matrix $T_{sty}$ obtained here has a human body posture consistent with the target human body posture and also contains the most critical human body appearance information.
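Broadcasting each style vector into its target mask and summing over the parts can be written as a single contraction. The sketch below assumes the target masks have already been resized to the feature resolution and do not overlap; `style_matrix` is a hypothetical name.

```python
import numpy as np

def style_matrix(v_sty, m_t):
    """Broadcast each part's style vector over that part's target mask and sum.

    v_sty: (L, C) style vectors; m_t: (L, H', W') target part masks.
    Returns T_sty with shape (C, H', W').
    """
    # For each part l, v_sty[l] is copied to every pixel where m_t[l] is 1;
    # the einsum also performs the element-by-element sum over all L parts.
    return np.einsum("lc,lhw->chw", v_sty, m_t)

# Toy case: part 0 covers the top row, part 1 the bottom row.
v_sty = np.array([[1.0, 5.0], [2.0, 6.0]])
m_t = np.zeros((2, 2, 2))
m_t[0, 0, :] = 1
m_t[1, 1, :] = 1
t_sty = style_matrix(v_sty, m_t)
```

The result carries source appearance (the vectors) laid out in the target posture (the masks), which is the property the text attributes to the style matrix.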
Two convolutional layers are applied to the style matrix $T_{sty}$ to obtain the modulation parameters $\gamma$ and $\beta$ of the normalization operation. Meanwhile, the target posture feature $F_t$ is batch-normalized to obtain $F_{norm}$. Finally, $F_{norm}$ is modulated with $\gamma$ and $\beta$ to obtain the stylized target posture feature $F_{sty}$:
$$F_{sty} = \gamma F_{norm} + \beta$$
Notably, the stylized target posture feature $F_{sty}$ obtained here retains the posture information of the target posture and also contains the most critical human body appearance information.
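The normalize-then-modulate step can be illustrated numerically. In this sketch the modulation is assumed to be element-wise, `gamma` and `beta` are constant arrays standing in for the outputs of the two convolutional layers, and the names `batch_norm` and `san_modulate` are hypothetical.

```python
import numpy as np

def batch_norm(f, eps=1e-5):
    """Normalize an (N, C, H, W) batch per channel to zero mean and unit variance."""
    mean = f.mean(axis=(0, 2, 3), keepdims=True)
    var = f.var(axis=(0, 2, 3), keepdims=True)
    return (f - mean) / np.sqrt(var + eps)

def san_modulate(f_t, gamma, beta):
    """F_sty = gamma * F_norm + beta, applied element-wise (assumed)."""
    return gamma * batch_norm(f_t) + beta

rng = np.random.default_rng(0)
f_t = rng.normal(size=(4, 8, 16, 8))   # stand-in target posture feature
gamma = np.full(f_t.shape, 2.0)        # stand-in for the conv output gamma
beta = np.full(f_t.shape, 0.5)         # stand-in for the conv output beta
f_sty = san_modulate(f_t, gamma, beta)
```

Because `gamma` and `beta` vary per pixel in the real module, the appearance statistics injected into `f_sty` can differ from one body part to the next while the spatial layout of `f_t` is kept.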
2.2.2) constructing a Feature Enhancement (FE) module.
The stylized target posture feature $F_{sty}$ and the source human body feature $F_s$ are concatenated and fused, and the fused feature is then enhanced with a Squeeze-and-Excitation operation to obtain the enhanced feature $F_{fuse}$; a residual module is added to further accelerate network training, yielding the final new target human body feature $C'_t$.
Here, the feature fusion operation fuses $C_t$, $F_t$ and $F_s$ by concatenation and addition. Finally, the new source human body feature $C'_s$ is the concatenation of the source human body feature $F_s$ and the new target human body feature $C'_t$.
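A rough sketch of the feature enhancement step is given below. The Squeeze-and-Excitation gate follows its usual definition (global average pooling, a two-layer bottleneck, a sigmoid gate), but the exact fusion formula is not recoverable from the text, so the wiring in `feature_enhance`, including the 1x1 projection `w_proj` back to C channels, is an assumption made only so that the shapes line up.

```python
import numpy as np

def squeeze_excite(x, w1, w2):
    """Squeeze-and-Excitation channel re-weighting for a (C, H, W) feature.

    w1: (C // r, C) and w2: (C, C // r) play the role of the two FC layers.
    """
    squeezed = x.mean(axis=(1, 2))                  # squeeze: global average pool
    hidden = np.maximum(w1 @ squeezed, 0.0)         # excitation bottleneck with ReLU
    scale = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))    # sigmoid gate in (0, 1)
    return x * scale[:, None, None]

def feature_enhance(c_t, f_sty, f_s, w1, w2, w_proj):
    """Hypothetical FE step: concatenate, SE, 1x1 projection, residual add."""
    fused = np.concatenate([f_sty, f_s], axis=0)           # (2C, H, W)
    f_fuse = squeeze_excite(fused, w1, w2)
    projected = np.einsum("oc,chw->ohw", w_proj, f_fuse)   # assumed projection to C
    new_c_t = c_t + projected                              # residual connection
    new_c_s = np.concatenate([f_s, new_c_t], axis=0)       # C'_s = concat(F_s, C'_t)
    return new_c_t, new_c_s

rng = np.random.default_rng(0)
C, H, W, r = 4, 8, 4, 2
c_t, f_sty, f_s = (rng.normal(size=(C, H, W)) for _ in range(3))
w1 = rng.normal(size=(2 * C // r, 2 * C))
w2 = rng.normal(size=(2 * C, 2 * C // r))
w_proj = rng.normal(size=(C, 2 * C))
new_c_t, new_c_s = feature_enhance(c_t, f_sty, f_s, w1, w2, w_proj)
```

The residual add means that a block whose projection is near zero passes $C_t$ through unchanged, which is one reason such connections tend to ease the training of a long cascade.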
2.3) constructing a Decoder (Decoder), decoding the new target human body features output by the last SAG-Blk in step 2.2), and generating a human body image of the target posture.
In an embodiment of the invention, the new target human body feature $C'_t$ output by the last SAG-Blk in step 2.2 is decoded with 2 upsampling convolutional layers to generate the human body image in the target posture.
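The data flow of step 2 as a whole (encoder, T = 9 cascaded SAG-Blk, decoder) can be summarized with a small driver. Every stage is passed in as a callable, and the stubs below only record the order of calls; `run_sagn` and its parameter names are hypothetical.

```python
def run_sagn(encode_target, encode_source, blocks, decode,
             source_image, source_pose, target_pose):
    """Encoder -> cascaded SAG-Blk -> decoder, with each stage given as a callable."""
    c_t = encode_target(target_pose)                # target human body feature C_t
    c_s = encode_source(source_image, source_pose)  # source human body feature C_s
    for block in blocks:                            # each block returns (C'_t, C'_s)
        c_t, c_s = block(c_t, c_s)
    return decode(c_t)                              # only C'_t is decoded

# Stub stages that only trace the flow of features.
calls = []

def make_block(index):
    def block(c_t, c_s):
        calls.append(index)
        return c_t + 1, c_s
    return block

image = run_sagn(
    encode_target=lambda p_t: 0,
    encode_source=lambda i_s, p_s: "source-feature",
    blocks=[make_block(i) for i in range(9)],
    decode=lambda c_t: f"decoded after {c_t} blocks",
    source_image=None, source_pose=None, target_pose=None,
)
```

Note that the source feature is threaded through every block but never decoded; it exists only to keep supplying appearance information to the target branch.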
1) data input to the convolutional neural network is organized.
The data input into the network is divided into two parts, one part is the target human body posture information expressed by heat map obtained in step 1, and the other part is the source human body image and the source human body posture information expressed by heat map.
2) generating a target posture human body image by using the convolutional neural network (SAGN) constructed in step 2.
And inputting the organized data into a network to generate a human body image of the target posture.
Step 4, constructing a loss function of a convolutional neural network (SAGN):
In the embodiment of the invention, a combination of an adversarial loss function, a perceptual loss function and a loss function based on human body structure similarity is used as the loss function of the convolutional neural network (SAGN) provided by the invention.
The adversarial loss function uses discriminators to measure the distance between the distribution of real images and the distribution of generated images, and to continuously reduce the distance between the two distributions. The invention constructs two discriminators, an appearance discriminator and a posture discriminator, which are used to ensure appearance consistency and posture consistency between the real image $I_t$ and the generated image. The formula is defined as:
where the two distributions are those of the human body posture images and of the real human body images, and the generator is the human body image generation network proposed by the present invention. More details can be found in "Progressive Pose Attention Transfer for Person Image Generation".
The perceptual loss function measures the similarity between the feature maps of the real image and of the generated image; usually the L1 distance between the two feature maps is used as the perceptual loss, namely:
where $\phi_i$ is the output of the $i$-th layer of a pre-trained network, usually the feature map output by the conv1_2 layer of a VGG-19 network pre-trained on ImageNet. See "Perceptual Losses for Real-Time Style Transfer and Super-Resolution" for more details.
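As a small numerical illustration of the L1 distance between two feature maps, the function below takes already-extracted features as input. Averaging over all elements is an assumed normalization, and the pre-trained feature extractor itself is left out; `perceptual_loss` is a hypothetical name.

```python
import numpy as np

def perceptual_loss(feat_real, feat_fake):
    """Mean L1 distance between two feature maps of identical shape."""
    return float(np.mean(np.abs(feat_real - feat_fake)))

# Stand-ins for the feature maps of the real and the generated image.
feat_real = np.ones((64, 32, 16))
feat_fake = np.zeros((64, 32, 16))
loss = perceptual_loss(feat_real, feat_fake)
```

In practice both inputs would come from the same frozen layer of the pre-trained network, so that the loss compares textures and structures rather than raw pixels.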
The loss function based on human body structure similarity measures the structural similarity of each human body part between the real image and the generated image. Accurately measuring the similarity of each human body part gives the virtual human body image clear human body boundaries and detailed texture. It is defined as:
where MSSIM(·,·) is the structural similarity of the background (part 0) of $I_t$ and the generated image, and $\mathrm{SSIM}_l(\cdot,\cdot)$ is the structural similarity of the $l$-th part of the human body image. See "Loss Functions for Person Image Generation" for more details.
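A simplified per-part structural similarity can be sketched as follows. This is not the formula of the embodiment, which is not reproduced in the text: the SSIM here is computed from global statistics over each mask instead of a sliding window, the parts are averaged with equal weight, and the constants assume pixel values in [0, 1]. The names `ssim_global` and `part_ssim_loss` are hypothetical.

```python
import numpy as np

def ssim_global(x, y, mask, c1=0.01 ** 2, c2=0.03 ** 2):
    """SSIM from statistics over the masked pixels (no sliding window)."""
    x, y = x[mask > 0], y[mask > 0]
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov = ((x - mu_x) * (y - mu_y)).mean()
    luminance = (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)
    structure = (2 * cov + c2) / (var_x + var_y + c2)
    return luminance * structure

def part_ssim_loss(real, fake, part_masks):
    """1 - mean SSIM over the parts that are present (assumed equal weights)."""
    scores = [ssim_global(real, fake, m) for m in part_masks if m.sum() > 0]
    return 1.0 - float(np.mean(scores))

rng = np.random.default_rng(0)
real = rng.random((16, 8))
masks = np.zeros((2, 16, 8))
masks[0, :8] = 1     # toy "part" covering the upper half
masks[1, 8:] = 1     # toy "part" covering the lower half
loss_same = part_ssim_loss(real, real, masks)
loss_diff = part_ssim_loss(real, 1.0 - real, masks)
```

Scoring each part separately keeps a small region such as a lower arm from being averaged away by the much larger trunk or background.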
4.1) constructing an adversarial loss function.
"Generative Adversarial Networks" achieves good results in image generation by using the adversarial loss function, and the adversarial loss function of that paper is used as one of the loss functions of the convolutional neural network (SAGN) proposed by the present invention.
4.2) constructing a perceptual loss function.
"Perceptual Losses for Real-Time Style Transfer and Super-Resolution" achieves good results in style transfer by using the perceptual loss function, and the perceptual loss function of that paper is adopted as one of the loss functions of the convolutional neural network (SAGN) proposed by the invention.
4.3) constructing a loss function based on human body structure similarity.
"Loss Functions for Person Image Generation" uses a loss function based on human body structure similarity to effectively compute the structural similarity of each part of the human body, and achieves good results in human body image generation.
Step 5, optimizing network parameters, and realizing the generation of the virtual human body image under the guidance of the target posture:
5.1) carrying out iterative optimization on the network parameters constructed in the step 2 according to the loss function obtained in the step 4;
The network is iterated 90k times using the Adam optimizer, with $\beta_1 = 0.5$ and $\beta_2 = 0.999$.
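For reference, a single Adam update with these moment coefficients looks as follows. The learning rate and `eps` are not given in the text and are assumed values here; `adam_step` is a hypothetical helper, not a library call.

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=2e-4, beta1=0.5, beta2=0.999, eps=1e-8):
    """One Adam update; returns the new parameter and both moment estimates."""
    m = beta1 * m + (1 - beta1) * grad          # first moment (running mean)
    v = beta2 * v + (1 - beta2) * grad ** 2     # second moment (running mean of squares)
    m_hat = m / (1 - beta1 ** t)                # bias correction, t counts from 1
    v_hat = v / (1 - beta2 ** t)
    return param - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

# First step on f(w) = ||w||^2, whose gradient is 2w.
w = np.array([1.0, -2.0])
m = np.zeros_like(w)
v = np.zeros_like(w)
w1, m, v = adam_step(w, 2 * w, m, v, t=1, lr=1e-2)
```

On the first step the bias-corrected ratio is close to the sign of the gradient, so each coordinate moves by roughly one learning rate regardless of gradient magnitude. A lower `beta1` such as 0.5 shortens the momentum memory, a setting often chosen for adversarial training.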
5.2) when the preset iteration number is reached, generating the virtual human body image under the guidance of the target posture by using the convolutional neural network (SAGN) constructed in the step 2.
In summary, for a source human body image and any target human body posture image, the method of the invention provides a virtual human body image generation network under posture guidance based on the human body structure. First, posture estimation is performed on the input human body image to obtain its joint point coordinate sequence. Then a virtual human body image generation network (SAGN) under posture guidance based on the human body structure is constructed, comprising an encoder, a structure-based appearance generation module (SAG-Blk) and a decoder, where the SAG-Blk is composed of a structure-aware adaptive normalization (SAN) sub-module and a Feature Enhancement (FE) sub-module. Next, a loss function of the convolutional neural network is constructed, comprising an adversarial loss function, a perceptual loss function and a loss function based on human body structure similarity. Finally, the proposed convolutional neural network (SAGN) is optimized with the combined loss function to generate a virtual human body image under target posture guidance. Qualitative and quantitative comparative experiments against existing methods were carried out, and the effectiveness of the method was verified on two public data sets, Market-1501 and DeepFashion.
Tables 1a and 1b are the results of the quantitative experiments of the present invention, respectively, with Table 1a being the results of the method under the Market-1501 data set and Table 1b being the results of the method under the DeepFashion data set.
TABLE 1a Experimental results of this method under Market-1501 data set
TABLE 1b Experimental results of this method under the DeepFashion data set
SSIM, IS and DS are common metrics of image generation quality: the larger the value, the more vivid and the higher the quality of the generated image. FID and LPIPS are also common metrics of image generation quality: the smaller the value, the more vivid and the higher the quality of the generated image. As can be seen from Table 1a, on the Market-1501 data set the images generated by the method achieve the best value on all metrics, in particular SSIM (structural similarity), where the highest value of 0.321 is reached. As can be seen from Table 1b, on the DeepFashion data set the images generated by the method rank second on SSIM, IS, DS and LPIPS, giving a more reliable image generation effect. Therefore, the quantitative results show that the virtual human body image generation method based on structural similarity can generate more realistic virtual human body images.
Fig. 5 and Fig. 6 show the qualitative experimental results of the present invention. Fig. 5 shows images generated by the invention on the Market-1501 data set; it can be seen that the method generates virtual human body images with a clear human body posture and a realistic appearance. Especially in the case of large posture changes, the virtual human body images generated by the method still maintain the correct human body posture (e.g., rows 2, 4 and 5). Fig. 6 shows images generated by the invention on the DeepFashion data set; it can be seen that human body images generated by other methods tend to contain artifacts, while the method still maintains the correct human body posture and a realistic appearance. Notably, even when the target posture is a very complex human body posture, the method maintains the correctness and integrity of the posture of the generated virtual human body, which makes the generated image look very realistic. Therefore, the qualitative results show that the virtual human body image generation method under posture guidance based on the human body structure can generate a vivid human body image with a correct posture.
In summary, the invention discloses a new method, system and electronic device for generating a virtual human body image under posture guidance based on a human body structure, belonging to the technical field at the intersection of computer vision and computer graphics. The invention constructs a virtual human body image generation network (SAGN) under posture guidance based on the human body structure, comprising an encoder, a structure-based appearance generation module (SAG-Blk) and a decoder, where the SAG-Blk consists of a structure-aware adaptive normalization (SAN) sub-module and a Feature Enhancement (FE) sub-module; then a loss function of the convolutional neural network is constructed, comprising an adversarial loss function, a perceptual loss function and a loss function based on human body structure similarity; finally, the proposed convolutional neural network (SAGN) is optimized with the combined loss function to generate a virtual human body image under target posture guidance. The invention can generate a vivid virtual human body image with a correct posture.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Although the present invention has been described in detail with reference to the above embodiments, those skilled in the art can make modifications and equivalents to the embodiments of the present invention without departing from the spirit and scope of the present invention, which is set forth in the claims of the present application.
Claims (10)
1. A virtual human body image generation method is characterized by comprising the following steps:
inputting a source human body image and a target posture image into a pre-trained virtual human body image generation network, the virtual human body image generation network outputting a target posture human body image;
wherein, the virtual human body image generation network is a convolution neural network, and comprises:
the encoder is used for inputting the source human body image and the target posture image, and encoding to obtain and output source human body characteristics and target human body characteristics;
the structure-based appearance generation module is used for inputting and updating the source human body characteristics and the target human body characteristics, acquiring and outputting the updated source human body characteristics and the updated target human body characteristics;
and the decoder is used for inputting the target human body characteristics output by the structure-based appearance generation module and decoding to obtain a target posture human body image.
2. The virtual human body image generation method according to claim 1, wherein the step of acquiring the trained virtual human body image generation network specifically includes:
acquiring a sample data set; each sample data in the sample data set comprises source human body image sample data, target human body image sample data, source human body posture sample data and target human body posture sample data;
inputting source human body image sample data, source human body posture sample data and target human body posture sample data in selected sample data of the sample data set into the virtual human body image generation network to obtain virtual target human body image data; constructing a loss function based on the virtual target human body image data and target human body image sample data in the selected sample data, and performing iterative optimization on the virtual human body image generation network;
and obtaining the trained virtual human body image generation network after reaching the preset iteration times or convergence conditions.
3. The virtual human body image generation method according to claim 1, wherein the structure-based appearance generation module comprises:
the structure-aware adaptive normalization module is used for inputting the source human body characteristics and the target human body characteristics, generating stylized target posture characteristics and outputting the stylized target posture characteristics;
and the characteristic enhancement module is used for inputting the generated stylized target posture characteristic and the source human body characteristic and outputting the updated source human body characteristic and the updated target human body characteristic.
4. The virtual human body image generation method according to claim 2,
in the sample data set, the step of acquiring the source human body posture sample data and the target human body posture sample data in each sample data comprises: performing posture estimation on the human body images with the OpenPose posture estimation method to obtain coordinate sequences of 18 human body joint points; wherein the joint point sequence of the source human body image $I_s$ is expressed as $P(I_s) = \{p_1, \dots, p_K\}$, $K = 18$, and the joint point sequence of the target human body image $I_t$ is expressed as $P(I_t) = \{p_1, \dots, p_K\}$, $K = 18$; based on the obtained human body joint point coordinate sequences, K heat maps are used to represent the human body posture information; wherein the source human body posture information is denoted $P_s$ and the target human body posture information is denoted $P_t$;
in the encoder, the step of encoding to obtain the source human body characteristic and the target human body characteristic specifically includes: encoding the target posture information $P_t$ into the target human body feature $C_t$ with 2 downsampling convolutional layers; and encoding the source human body image $I_s$ and its posture information $P_s$ into the source human body feature $C_s$ with 2 downsampling convolutional layers.
5. The method for generating a virtual human body image according to claim 4, wherein the step of inputting and updating the source human body features and the target human body features in the structure-based appearance generation module to obtain the updated source human body features and the updated target human body features specifically comprises:
dividing the human body image into a plurality of human body parts and 1 background part based on the obtained human body joint point coordinate sequence, to obtain L part masks; wherein the part masks of the source human body image are denoted $M_s$ and the part masks of the target human body image are denoted $M_t$;
convolving the target human body feature $C_t$ with two convolutional layers to obtain the target human body feature $F_t$;
convolving the source human body feature $C_s$ with two convolutional layers to obtain the source human body feature $F_s$;
generating a style vector $V_{sty}$ from the source human body feature $F_s$ and the part masks $M_s$ of the source human body image, wherein each row $V_{sty}^{l}$ of $V_{sty}$ is a C-dimensional vector representing the features of the l-th part of the source human body image, and the style vector of the l-th part is obtained by mean pooling, in which Resize(·) denotes a scaling operation, $\odot$ denotes element-wise multiplication, and Pool(·) denotes pooling;
according to the correspondence between the parts of the source human body image and the target human body image, inserting the style vectors $V_{sty}$ into the corresponding parts of the target part masks $M_t$ to obtain the style matrix $T_{sty}$; wherein the l-th style vector $V_{sty}^{l}$ is inserted into the l-th mask of the target human body image by broadcasting to generate the l-th style matrix $T_{sty}^{l}$, and all L style matrices $T_{sty}^{l}$, $l = 1, \dots, L$, are added element by element to obtain the final style matrix $T_{sty}$:
$T_{sty} = \sum_{l=1}^{L} T_{sty}^{l}$;
convolving the style matrix $T_{sty}$ with two convolutional layers to obtain the modulation parameters $\gamma$ and $\beta$ of the normalization operation;
applying batch normalization to the target posture feature $F_t$ to obtain $F_{norm}$, and modulating $F_{norm}$ with $\gamma$ and $\beta$ to obtain the stylized target posture feature $F_{sty}$:
$F_{sty} = \gamma F_{norm} + \beta$;
concatenating and fusing the stylized target posture feature $F_{sty}$ with the source human body feature $F_s$, then enhancing the fused feature with a Squeeze-and-Excitation operation to obtain the enhanced feature $F_{fuse}$, and obtaining the updated target human body feature $C'_t$;
wherein the updated source human body feature $C'_s$ is the concatenation of the source human body feature $F_s$ and the updated target human body feature $C'_t$.
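The steps of claim 5 up to the enhanced feature can be sketched in NumPy as below. This is an illustrative reading under stated assumptions, not the patented implementation. The masked-mean form of the pooling is an assumption, since the pooling formula itself is not reproduced in this text. The two convolutional layers that produce γ and β are replaced by hypothetical per-pixel linear maps `W_gamma` and `W_beta`. Batch normalization is approximated by per-channel statistics of a single sample. The toy masks, the sizes, and the weights `W1` and `W2` are made up, and the final step from the enhanced feature to the updated target feature is left out.

```python
import numpy as np

def resize_nearest(masks, height, width):
    """Nearest-neighbour Resize of an (L, H0, W0) mask stack to (L, height, width)."""
    rows = np.arange(height) * masks.shape[1] // height
    cols = np.arange(width) * masks.shape[2] // width
    return masks[:, rows][:, :, cols]

def style_vectors(F_s, M_s):
    """Masked mean pooling of F_s (C, H, W): one C-dim vector per part, shape (L, C)."""
    m = resize_nearest(M_s, F_s.shape[1], F_s.shape[2])
    summed = np.einsum("chw,lhw->lc", F_s, m)      # sum of features inside each part
    area = m.sum(axis=(1, 2))[:, None]             # number of positions in each part
    return summed / np.maximum(area, 1.0)          # empty parts yield a zero vector

def style_matrix(V_sty, M_t, height, width):
    """Broadcast each style vector over its target mask and sum over parts: (C, H, W)."""
    m = resize_nearest(M_t, height, width)
    return np.einsum("lc,lhw->chw", V_sty, m)

def modulate(F_t, T_sty, W_gamma, W_beta, eps=1e-5):
    """Normalise F_t per channel, then apply F_sty = gamma * F_norm + beta."""
    mean = F_t.mean(axis=(1, 2), keepdims=True)    # single-sample stand-in for batch norm
    var = F_t.var(axis=(1, 2), keepdims=True)
    F_norm = (F_t - mean) / np.sqrt(var + eps)
    gamma = np.einsum("dc,chw->dhw", W_gamma, T_sty)  # hypothetical linear maps standing
    beta = np.einsum("dc,chw->dhw", W_beta, T_sty)    # in for the two conv layers
    return gamma * F_norm + beta

def squeeze_excite(F, W1, W2):
    """Squeeze-and-Excitation: rescale channels by gates computed from global averages."""
    z = F.mean(axis=(1, 2))                           # squeeze to one value per channel
    hidden = np.maximum(W1 @ z, 0.0)                  # bottleneck with ReLU
    gate = 1.0 / (1.0 + np.exp(-(W2 @ hidden)))       # sigmoid gate per channel
    return F * gate[:, None, None]

rng = np.random.default_rng(0)
C, H, W, L = 4, 8, 6, 3
F_s, F_t = rng.normal(size=(C, H, W)), rng.normal(size=(C, H, W))
M_s = np.zeros((L, 16, 12))                           # toy masks: three horizontal bands
M_s[0, :5], M_s[1, 5:10], M_s[2, 10:] = 1.0, 1.0, 1.0
M_t = M_s[:, ::-1]                                    # target pose: bands flipped vertically
V_sty = style_vectors(F_s, M_s)
T_sty = style_matrix(V_sty, M_t, H, W)
F_sty = modulate(F_t, T_sty, np.eye(C), np.eye(C))
fused = np.concatenate([F_sty, F_s], axis=0)          # channel-wise concatenation
F_fuse = squeeze_excite(fused, rng.normal(size=(2, 2 * C)), rng.normal(size=(2 * C, 2)))
```

Because the target masks are the source masks flipped vertically, the style of the top band of the source ends up in the bottom band of `T_sty`, which is the part-to-part transfer the claim describes.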
6. The virtual human body image generation method of claim 5, wherein the loss function comprises: an adversarial loss function, a perceptual loss function, and a loss function based on human body structural similarity.
7. The virtual human body image generation method of claim 1, wherein the structure-based appearance generation module is replaced with an integrated appearance generation module;
the integrated appearance generating module is composed of a plurality of the structure-based appearance generating modules in a cascade.
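The cascade in claim 7 amounts to feeding the outputs of one structure-based appearance generation module into the next. A minimal sketch, in which `cascade` and `toy_block` are hypothetical names and `toy_block` is only a placeholder for the real module:

```python
def cascade(blocks, C_s, C_t):
    """Run the appearance-generation blocks in order, feeding each output to the next."""
    for block in blocks:
        C_s, C_t = block(C_s, C_t)
    return C_s, C_t

def toy_block(C_s, C_t):
    """Placeholder block (not the patented module): moves C_t halfway towards C_s."""
    C_t = 0.5 * (C_s + C_t)
    return C_s, C_t

# Three cascaded blocks on scalar stand-ins for the feature maps.
C_s_out, C_t_out = cascade([toy_block] * 3, 8.0, 0.0)
```

In this toy setting the target value moves closer to the source value with every block, which mirrors the intent of refining the target features step by step.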
8. A virtual human body image generation system, comprising:
an image generation module, configured to input the source human body image and the target posture image into a pre-trained virtual human body image generation network, which outputs a target posture human body image;
wherein the virtual human body image generation network is a convolutional neural network comprising:
the encoder is used for inputting a source human body image and a target posture image, and encoding to obtain a source human body characteristic and a target human body characteristic;
the structure-based appearance generation module is used for inputting and updating the source human body characteristics and the target human body characteristics to obtain updated source human body characteristics and target human body characteristics;
and the decoder is used for inputting the target human body characteristics output by the structure-based appearance generation module and decoding to obtain a target posture human body image.
9. An electronic device, characterized in that the electronic device comprises a processor and a memory, the processor being configured to execute a computer program stored in the memory to implement the virtual human body image generation method according to any one of claims 1 to 7.
10. A computer-readable storage medium storing at least one instruction which, when executed by a processor, implements the virtual human body image generation method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110865481.2A CN113592971B (en) | 2021-07-29 | 2021-07-29 | Virtual human body image generation method, system, equipment and medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113592971A (en) | 2021-11-02 |
CN113592971B (en) | 2024-04-16 |
Family
ID=78252264
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110865481.2A Active CN113592971B (en) | 2021-07-29 | 2021-07-29 | Virtual human body image generation method, system, equipment and medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113592971B (en) |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10679046B1 (en) * | 2016-11-29 | 2020-06-09 | MAX-PLANCK-Gesellschaft zur Förderung der Wissenschaften e.V. | Machine learning systems and methods of estimating body shape from images |
WO2020168844A1 (en) * | 2019-02-19 | 2020-08-27 | Boe Technology Group Co., Ltd. | Image processing method, apparatus, equipment, and storage medium |
CN110852941A (en) * | 2019-11-05 | 2020-02-28 | 中山大学 | Two-dimensional virtual fitting method based on neural network |
CN112116673A (en) * | 2020-07-29 | 2020-12-22 | 西安交通大学 | Virtual human body image generation method and system based on structural similarity under posture guidance and electronic equipment |
Non-Patent Citations (2)
Title |
---|
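张婧; 孙金根; 陈亮; 刘韵婷: "Single-person multi-pose image generation method based on unsupervised learning", 光电技术应用, no. 02 * |
陈佳宇; 钟跃崎; 余志才: "Three-dimensional human body model reconstruction based on binary images", 毛纺科技, no. 09 * |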
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114821811A (en) * | 2022-06-21 | 2022-07-29 | 平安科技(深圳)有限公司 | Method and device for generating person composite image, computer device and storage medium |
CN114821811B (en) * | 2022-06-21 | 2022-09-30 | 平安科技(深圳)有限公司 | Method and device for generating person composite image, computer device and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN113592971B (en) | 2024-04-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109255831B (en) | Single-view face three-dimensional reconstruction and texture generation method based on multi-task learning | |
CN116977522A (en) | Rendering method and device of three-dimensional model, computer equipment and storage medium | |
CN115914505B (en) | Video generation method and system based on voice-driven digital human model | |
CN113160035A (en) | Human body image generation method based on posture guidance, style and shape feature constraints | |
CN113570685A (en) | Image processing method and device, electronic device and storage medium | |
CN116385667B (en) | Reconstruction method of three-dimensional model, training method and device of texture reconstruction model | |
CN110751733A (en) | Method and apparatus for converting 3D scanned object into avatar | |
CN113362422A (en) | Shadow robust makeup transfer system and method based on decoupling representation | |
CN111462274A (en) | Human body image synthesis method and system based on SMPL model | |
CN115049556A (en) | StyleGAN-based face image restoration method | |
CN115018989B (en) | Three-dimensional dynamic reconstruction method based on RGB-D sequence, training device and electronic equipment | |
CN112819951A (en) | Three-dimensional human body reconstruction method with shielding function based on depth map restoration | |
CN117635771A (en) | Scene text editing method and device based on semi-supervised contrast learning | |
CN117237542B (en) | Three-dimensional human body model generation method and device based on text | |
CN113592971B (en) | Virtual human body image generation method, system, equipment and medium | |
CN117593178A (en) | Virtual fitting method based on feature guidance | |
CN112116673B (en) | Virtual human body image generation method and system based on structural similarity under posture guidance and electronic equipment | |
CN116934972B (en) | Three-dimensional human body reconstruction method based on double-flow network | |
CN111311732A (en) | 3D human body grid obtaining method and device | |
CN116863044A (en) | Face model generation method and device, electronic equipment and readable storage medium | |
CN114092610B (en) | Character video generation method based on generative adversarial network | |
CN116978057A (en) | Human body posture migration method and device in image, computer equipment and storage medium | |
Motegi et al. | Human motion generative model using variational autoencoder | |
CN114331894A (en) | Face image restoration method based on potential feature reconstruction and mask perception | |
CN117893642B (en) | Face shape remodelling and facial feature exchanging face changing method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||