CN115272632B - Virtual fitting method based on gesture migration - Google Patents

Virtual fitting method based on gesture migration

Info

Publication number
CN115272632B
CN115272632B (application CN202210795212.8A)
Authority
CN
China
Prior art keywords
image
clothing
analysis
network
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210795212.8A
Other languages
Chinese (zh)
Other versions
CN115272632A (en)
Inventor
朱佳龙
姜明华
史衍康
陈子宜
刘军
余锋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Textile University
Original Assignee
Wuhan Textile University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Textile University
Priority to CN202210795212.8A
Publication of CN115272632A
Application granted
Publication of CN115272632B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 19/00 Manipulating 3D models or images for computer graphics
    • G06T 19/20 Editing of 3D images, e.g. changing shapes or colours, aligning objects or positioning parts
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q 30/00 Commerce
    • G06Q 30/06 Buying, selling or leasing transactions
    • G06Q 30/0601 Electronic shopping [e-shopping]
    • G06Q 30/0641 Shopping interfaces
    • G06Q 30/0643 Graphical representation of items or shoppers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2200/00 Indexing scheme for image data processing or generation, in general
    • G06T 2200/04 Indexing scheme for image data processing or generation, in general, involving 3D image data
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P 90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P 90/30 Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Computer Hardware Design (AREA)
  • Computer Graphics (AREA)
  • Architecture (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention relates to a virtual fitting method based on gesture migration, comprising the following steps: acquire the original parsing map and the wearer image, extract the clothing pixel information from the wearer image, and perform texture repair to obtain a refined clothing image; input the original parsing map and the target pose into the parsing guidance network to obtain a parsing guidance map; use the parsing guidance map to set an initial limit on the warping range of the clothing; acquire the target pose, preprocess it to obtain a parsing map with the lower body removed, and obtain the warped clothing image through the clothing warping network; and generate the try-on result in the target pose from the parsing guidance map, the warped clothing image, the target pose, and the wearer image. By feeding the parsing guidance map, the warped target clothing image, the target pose, and the wearer image into the neural network model simultaneously, the invention obtains a try-on rendering of the target pose, improves the fitting effect, and avoids the mixing of skin and cloth pixels caused by changes in the wearer's pose.

Description

Virtual fitting method based on gesture migration
Technical Field
The invention belongs to the field of clothing image processing, and particularly relates to a virtual fitting method based on gesture migration.
Background
In recent years, shopping has shifted from offline to online, and online clothing shopping has become popular with consumers; however, shoppers cannot try garments on and therefore cannot experience how the clothes would look when worn. Virtual fitting lets sellers display the advantages of their garments more objectively and gives both parties to a transaction more intuitive information, promoting sales, reducing unnecessary work, improving efficiency, and meeting user demand.
Existing techniques that combine virtual fitting with pose transfer to support multiple poses fall into two categories: virtual fitting based on 2D images and virtual fitting based on 3D reconstruction. Few techniques perform multi-pose fitting directly on 2D images, and their results exhibit mixed skin and cloth pixels, loss of detail, and similar artifacts. 3D-reconstruction approaches perform better, but their demands on computing power and on the performance and quality of the generated models are comparatively high, which hinders the technology's adoption.
Chinese patent publication CN108734787A discloses a picture-synthesis virtual fitting method based on multiple poses and part decomposition. Instead of simply synthesizing a whole clothing picture, it synthesizes using multiple poses and decomposed parts, achieving a more realistic virtual fitting effect; however, it does not consider the mixing of skin and cloth pixels, the loss of detail, and other problems caused by pose changes, which greatly affects the fitting result.
Disclosure of Invention
The invention aims to solve the above problems by providing a virtual fitting method based on gesture migration. The method uses a parsing guidance map to limit the warping range of the target clothing image, preventing excessive distortion when the garment is deformed to follow the target pose. From the target pose and a parsing map with the lower body removed, a clothing warping network produces a target clothing image warped to the target pose. The wearer image, the warped target clothing image, the target pose, and the parsing guidance map are then input simultaneously into a try-on image generation network, which produces the try-on result in the target pose. This improves the fitting effect, avoids the mixing of skin and cloth pixels caused by pose changes, and preserves more of the garment's texture detail.
The technical solution of the invention is a virtual fitting method based on gesture migration, comprising the following steps:
Step 1: acquire the original parsing map and the wearer image; first extract the clothing pixel information from the wearer image to obtain a coarse clothing image, then perform texture repair to obtain a refined clothing image.
Step 2: input the original parsing map, the target clothing, and the target pose into the parsing guidance network to obtain a parsing guidance map.
Step 3: set an initial limit on the warping range of the target clothing according to the parsing guidance map.
Step 4: acquire the target pose and preprocess it to obtain a parsing map with the lower body removed; obtain the warped target clothing image through the clothing warping network.
Step 5: generate the try-on result in the target pose through the try-on image generation network according to the parsing guidance map, the warped target clothing image, the target pose, and the wearer image.
Further, step 1 performs pixel-level repair on the clothing image. The specific repair process is: first, a convolutional layer learns the edge features of the clothing image and focuses on the regions where pixel values change sharply; then those regions are repaired by interpolation, so that the garment edges are smooth and transition naturally into the background. A repair sketch follows.
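The repair step can be sketched as below. A fixed Laplacian filter stands in for the learned convolutional layer, and Telea inpainting stands in for the unspecified interpolation; the threshold and radius are illustrative assumptions, not values from the patent.

```python
import cv2
import numpy as np

def repair_garment_edges(garment_bgr: np.ndarray, grad_thresh: float = 60.0) -> np.ndarray:
    """Repair blurred or notched garment edges by interpolation.

    garment_bgr: 8-bit BGR image. grad_thresh and the inpainting radius
    are illustrative values, not taken from the patent.
    """
    gray = cv2.cvtColor(garment_bgr, cv2.COLOR_BGR2GRAY)
    # Flag regions whose pixel values change sharply (the garment edges).
    grad = np.abs(cv2.Laplacian(gray, cv2.CV_64F))
    mask = (grad > grad_thresh).astype(np.uint8) * 255
    mask = cv2.dilate(mask, np.ones((3, 3), np.uint8))
    # Interpolate new pixel values over the flagged regions so the edge
    # is smooth and blends into the background.
    return cv2.inpaint(garment_bgr, mask, 3, cv2.INPAINT_TELEA)
```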
Preferably, step 1 comprises the following sub-steps:
First, according to the clothing semantic information in the original parsing map, extract the pixel information of the corresponding region of the wearer image to obtain a preliminary clothing image; this image may have blurred edges or notches along its edges.
Then, fill in the blurred and notched regions by interpolation to obtain a finer clothing image.
Further, the specific process of step 2 is as follows:
First, the original parsing map and the target pose are input into the parsing guidance network, and image features are extracted with its multi-layer convolutional network. A residual module and a wavelet sampling layer are added to the network to extract higher-level semantic structure, so that the network learns in depth the relational details among the parts of the human body; the wavelet sampling layer uses a wavelet transform to convert the feature map into the frequency domain for downsampling, which better preserves texture information.
Next, the extracted image features are input into the multi-layer deconvolution network of the parsing guidance network to upsample the image. A normalization layer is added between the deconvolutions to strengthen the fusion of global and local features, and a normalization constraint loss function is introduced so that the upsampling process retains more semantic detail.
Finally, the generated parsing guidance map is compared with the target pose in spatial position to ensure that each semantic part aligns with its corresponding pose keypoint, which handles the overlap between arm and garment better; the semantic positions are then fine-tuned to obtain a more regular parsing guidance map.
Preferably, the normalization constraint loss function is as follows:

$$\mathcal{L}_{norm} = \lambda_1\,\mathcal{L}_{glb}(G,\,G') + \lambda_2\,\mathcal{L}_{loc}(L,\,L')$$

where $\mathcal{L}_{norm}$ denotes the normalization constraint loss; $G$ and $G'$ denote the global features of the image before and after parsing; $L$ and $L'$ denote the local features of the image before and after parsing; $\mathcal{L}_{glb}$ denotes the global feature-matching loss between the images before and after parsing; $\mathcal{L}_{loc}$ denotes the local feature-matching loss between the images before and after parsing; and $\lambda_1$, $\lambda_2$ are learning coefficients that adjust the relative importance of the global and local features.
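A minimal sketch of this loss in PyTorch, assuming L1 feature matching and equal initial weights (the patent names only the two terms and the learning coefficients):

```python
import torch
import torch.nn.functional as F

def norm_constraint_loss(G, G_prime, L, L_prime, lam_g=0.5, lam_l=0.5):
    """Weighted sum of the global and local feature-matching losses.

    L1 matching and the 0.5 weights are assumptions; in training the
    lambda coefficients would be learned, as the text describes.
    """
    loss_glb = F.l1_loss(G, G_prime)  # global feature matching
    loss_loc = F.l1_loss(L, L_prime)  # local feature matching
    return lam_g * loss_glb + lam_l * loss_loc
```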
The parsing guidance map contains semantic segmentation information, specifically: face, hair, neck, upper-garment area, left arm, right arm, left hand, right hand, left shoulder, right shoulder, and lower-garment area.
Preferably, the target pose contains 18 keypoints: nose, neck, right shoulder, right elbow, right wrist, left shoulder, left elbow, left wrist, right hip, right knee, right ankle, left hip, left knee, left ankle, right eye, left eye, right ear, and left ear.
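This list matches the 18-point OpenPose/COCO convention, so a pose can be handled as an indexed coordinate array; the index order below is an assumption, since the patent gives names only:

```python
# The 18 keypoints above in OpenPose/COCO order; the index assignment is
# an assumption, since the patent lists the names without indices.
POSE_KEYPOINTS = [
    "nose", "neck",
    "right_shoulder", "right_elbow", "right_wrist",
    "left_shoulder", "left_elbow", "left_wrist",
    "right_hip", "right_knee", "right_ankle",
    "left_hip", "left_knee", "left_ankle",
    "right_eye", "left_eye", "right_ear", "left_ear",
]
```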
Further, the specific process of step 4 is as follows:
First, based on the distinct pixel values of the semantic regions in the parsing guidance map, the lower-body semantics are removed to obtain a parsing map with the lower body removed.
Then, the parsing map with the lower body removed and the target pose constrain the overall contour of the garment's deformation, preventing the warping network from deforming the clothing image by force and avoiding excessive distortion of the garment.
Finally, the clothing image is deformed by the warping network with a plane deformation loss function introduced, yielding the warped clothing image.
Preferably, the plane deformation loss function is as follows:

$$\mathcal{L}_{flat} = \sum_{(x,\,y)} \Big( \gamma\,\lVert C_x(x+i,\,y) - C_x(x,\,y)\rVert + \delta\,\lVert C_y(x,\,y+j) - C_y(x,\,y)\rVert \Big)$$

where $C_x(x)$ and $C_y(x)$ denote the x and y coordinates of the sampling parameter, $\lVert C_x(x+i,y) - C_x(x,y)\rVert$ denotes the Euclidean distance between two nodes, $i$ and $j$ are deformation amounts, and $\gamma$ and $\delta$ are deformation coefficients.
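A sketch of this smoothness penalty over the warp's sampling grid, assuming the neighbour-difference form reconstructed above (the offsets and coefficients are illustrative):

```python
import torch

def plane_deformation_loss(Cx, Cy, i=1, j=1, gamma=1.0, delta=1.0):
    """Penalize abrupt changes in the warp's sampling coordinates.

    Cx, Cy: (H, W) tensors of x / y sampling coordinates. The offsets
    i, j and coefficients gamma, delta mirror the where-clause; the
    neighbour-difference form itself is a reconstruction.
    """
    dx = torch.abs(Cx[:, i:] - Cx[:, :-i]).sum()  # horizontal smoothness
    dy = torch.abs(Cy[j:, :] - Cy[:-j, :]).sum()  # vertical smoothness
    return gamma * dx + delta * dy
```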
Further, in step 5, the try-on image generation network is an end-to-end network comprising a generator and a discriminator. The generator takes the parsing guidance map, the warped clothing image, and the wearer image as input; under the constraint of the parsing guidance map, it generates a coarse try-on result from the pixel information of the warped clothing image and the wearer image. The coarse result then passes through the discriminator, into which a feature-point matching loss function is introduced, to judge whether it conforms to the target pose and to extract more arm-region features, continually enhancing the detail of the coarse try-on result and improving image clarity.
Preferably, the feature-point matching loss function is as follows:

$$\mathcal{L}_{point} = \frac{1}{n} \sum_{i=1}^{n} \Big( \alpha\,\lVert W_i(x) - M_i(x)\rVert + \beta\,\lVert W_i(y) - M_i(y)\rVert \Big), \qquad \alpha + \beta = 1$$

where $\mathcal{L}_{point}$ denotes the feature-point matching loss; $W$ denotes the human-body pose coordinate points in the coarse try-on result; $M$ denotes the target-pose coordinate points; $W_i(x)$ denotes the abscissa of coordinate point $i$ in the coarse try-on result; $M_i(x)$ denotes the abscissa of coordinate point $i$ in the target pose; $n$ is the total number of feature points; $\lVert W_i(x) - M_i(x)\rVert$ denotes the distance along the x axis between keypoints of the same body part; and $\alpha$, $\beta$ are adjustment coefficients with $\alpha + \beta = 1$.
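A minimal sketch of this keypoint loss, assuming the per-axis split of the reconstructed formula above:

```python
import torch

def keypoint_match_loss(W, M, alpha=0.5, beta=0.5):
    """Match the pose keypoints of the coarse try-on result to the target.

    W, M: (n, 2) tensors of (x, y) keypoint coordinates; alpha + beta = 1.
    The equal default weights are an assumption.
    """
    assert abs(alpha + beta - 1.0) < 1e-6
    n = W.shape[0]
    loss_x = torch.abs(W[:, 0] - M[:, 0]).sum()  # same-part distance on x
    loss_y = torch.abs(W[:, 1] - M[:, 1]).sum()  # same-part distance on y
    return (alpha * loss_x + beta * loss_y) / n
```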
Compared with the prior art, the invention has the following beneficial effects:
(1) The parsing guidance map containing semantic segmentation information, the target clothing image warped to the target pose, the target pose, and the wearer image are input simultaneously into the try-on image generation network, which produces the try-on result in the target pose. This greatly improves the fitting effect, avoids the mixing of skin and cloth pixels caused by pose changes, preserves more garment texture detail in the result, and improves the virtual try-on experience.
(2) The parsing guidance map containing semantic segmentation information limits the warping range of the target clothing image, preventing excessive distortion as the garment deforms to follow the target pose and making the virtual try-on effect more realistic.
(3) The clothing image is obtained from the wearer image, and its blurred and notched regions are texture-refined to yield a finer clothing image. This alleviates the shortage of clothing images in the training dataset, aids the training of the try-on image generation network, the parsing guidance network, and the clothing warping network, and strengthens the robustness of the fitting method.
(4) A normalization layer and a normalization constraint loss function are introduced into the process of producing the parsing guidance map, strengthening the fusion of global and local features while making the upsampling process retain more semantic detail.
(5) A feature-point matching loss function is introduced into the try-on image generation network to judge whether the preliminary try-on result conforms to the target pose, which effectively avoids cross-occlusion between the arms and the garment and further improves the virtual fitting effect.
Drawings
The invention is further described below with reference to the drawings and examples.
Fig. 1 is a schematic flow chart of a virtual fitting method according to an embodiment of the invention.
Fig. 2 is a structural diagram of the parsing guidance network of the virtual fitting method according to an embodiment of the present invention.
Fig. 3 is a structural diagram of the clothing warping network of the virtual fitting method according to an embodiment of the present invention.
Fig. 4 is a structural diagram of the try-on image generation network of the virtual fitting method according to an embodiment of the present invention.
Fig. 5 is a schematic view of a virtual fitting system according to an embodiment of the present invention.
Detailed Description
Example 1
As shown in Fig. 1, the virtual fitting method based on gesture migration comprises the following steps:
(1) Acquire the original parsing map and the wearer image; first extract the clothing pixel information from the wearer image to obtain a coarse clothing image, then perform texture repair to obtain a refined clothing image.
The clothing image is acquired as follows: first, according to the clothing semantic information in the original parsing map, the pixel information of the corresponding region of the wearer image is extracted to obtain a coarse clothing image whose edges are blurred and notched. Texture repair is then applied: the blurred and notched regions of the coarse image are filled in by interpolation to obtain a finer clothing image.
The original parsing map contains the semantic information of each part of the wearer: face, hair, neck, upper-garment area, left arm, right arm, left hand, right hand, left shoulder, right shoulder, and lower-garment area.
Texture repair first learns the edge features of the clothing image through a convolutional neural network, focusing on regions where pixel values change sharply, and then repairs those regions by interpolation so that the garment edges are smooth and transition naturally into the background.
(2) Acquire the original parsing map and the target pose, and input them into the parsing guidance network to obtain the parsing guidance map.
The parsing guidance map shows the semantic segmentation after the wearer's pose is transformed, including the face, hair, neck, upper garment, arms, and lower garment.
The target pose consists of 18 keypoints: nose, neck, right shoulder, right elbow, right wrist, left shoulder, left elbow, left wrist, right hip, right knee, right ankle, left hip, left knee, left ankle, right eye, left eye, right ear, and left ear.
As shown in Fig. 2, the parsing guidance network consists of a multi-layer convolutional network and a multi-layer deconvolution network; its inputs are the original parsing map, the target clothing, and the target pose, and its output is the parsing guidance map.
The parsing guidance process is as follows. First, the original parsing map and the target pose are input, and image features are extracted through the multi-layer convolutional network; a residual module and a wavelet sampling layer are added to the network to extract higher-level semantic structure, so that it learns in depth the relational details among the parts of the human body, the wavelet sampling layer using a wavelet transform to convert the feature map into the frequency domain for downsampling, which better preserves texture information. Next, the extracted features are fed into the multi-layer deconvolution network to upsample the image; a normalization layer is added between the deconvolutions to strengthen the fusion of global and local features, and a normalization constraint loss function is introduced so that more semantic detail is retained during upsampling. Finally, the generated parsing guidance map is compared with the target pose in spatial position to ensure that each semantic part aligns with its corresponding pose keypoint, handling the overlap between arm and garment better; the semantic positions are fine-tuned to obtain a more regular parsing guidance map. A sketch of the wavelet downsampling follows.
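A minimal sketch of the wavelet sampling layer, assuming a one-level Haar transform (the patent does not name the wavelet basis):

```python
import torch
import torch.nn as nn

class HaarWaveletDown(nn.Module):
    """Wavelet sampling layer: downsample a feature map in the frequency
    domain. A one-level Haar transform is assumed."""

    def forward(self, x):                 # x: (B, C, H, W), H and W even
        a = x[:, :, 0::2, 0::2]
        b = x[:, :, 0::2, 1::2]
        c = x[:, :, 1::2, 0::2]
        d = x[:, :, 1::2, 1::2]
        ll = (a + b + c + d) / 2          # low-frequency approximation
        lh = (a - b + c - d) / 2          # horizontal detail
        hl = (a + b - c - d) / 2          # vertical detail
        hh = (a - b - c + d) / 2          # diagonal detail
        # Keeping every sub-band is what lets texture information
        # survive the 2x downsampling.
        return torch.cat([ll, lh, hl, hh], dim=1)   # (B, 4C, H/2, W/2)
```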
In the normalization layer, the features from the previous deconvolution layer are treated as local features and the features from the next deconvolution layer as global features; the normalization constraint loss function is introduced to control how much the current local and global features influence the subsequent fusion result.
The normalization constraint loss function is expressed as:

$$\mathcal{L}_{norm} = \lambda_1\,\mathcal{L}_{glb}(G,\,G') + \lambda_2\,\mathcal{L}_{loc}(L,\,L')$$

where $\mathcal{L}_{norm}$ denotes the normalization constraint loss; $G$ and $G'$ denote the global features of the image before and after parsing; $L$ and $L'$ denote the local features of the image before and after parsing; $\mathcal{L}_{glb}$ denotes the global feature-matching loss between the images before and after parsing; $\mathcal{L}_{loc}$ denotes the local feature-matching loss between the images before and after parsing; and $\lambda_1$, $\lambda_2$ are learning coefficients that adjust the relative importance of the global and local features.
(3) Set an initial limit on the warping range of the clothing according to the parsing guidance map.
(4) Acquire the target pose and preprocess it to obtain a parsing map with the lower body removed; obtain the warped clothing image through the clothing warping network, as shown in Fig. 3.
The warped clothing image is acquired as follows: first, based on the distinct pixel values of the semantic regions in the parsing guidance map, the lower-body semantics are removed to obtain a parsing map with the lower body removed (a masking sketch follows); then the parsing map with the lower body removed and the target pose constrain the overall contour of the garment's deformation, preventing the warping network from deforming the clothing image by force and avoiding excessive distortion; finally, the clothing image is deformed by the warping network with a plane deformation loss function introduced, yielding the warped clothing image.
The plane deformation loss function is expressed as:

$$\mathcal{L}_{flat} = \sum_{(x,\,y)} \Big( \gamma\,\lVert C_x(x+i,\,y) - C_x(x,\,y)\rVert + \delta\,\lVert C_y(x,\,y+j) - C_y(x,\,y)\rVert \Big)$$

where $C_x(x)$ and $C_y(x)$ denote the x and y coordinates of the sampling parameter, $\lVert C_x(x+i,y) - C_x(x,y)\rVert$ denotes the Euclidean distance between two nodes, $i$ and $j$ are deformation amounts, and $\gamma$ and $\delta$ are deformation coefficients.
(5) Generate the try-on result in the target pose through the image generation network according to the parsing guidance map, the warped clothing image, the target pose, and the wearer image.
As shown in Fig. 4, the image generation network is an end-to-end network consisting of a generator and a discriminator; the generator consists of an encoder and a decoder. The generator takes the parsing guidance map, the warped clothing image, and the wearer image as input and, under the constraint of the parsing guidance map, generates a coarse try-on result from the pixel information of the warped clothing image and the wearer image. The human-body pose map of the coarse result is then obtained and judged against the target pose by the discriminator, into which a feature-point matching loss function is introduced; the network is encouraged to extract more arm-region features, continually enhancing the detail of the coarse result and improving image clarity.
The feature-point matching loss function is as follows:

$$\mathcal{L}_{point} = \frac{1}{n} \sum_{i=1}^{n} \Big( \alpha\,\lVert W_i(x) - M_i(x)\rVert + \beta\,\lVert W_i(y) - M_i(y)\rVert \Big), \qquad \alpha + \beta = 1$$

where $\mathcal{L}_{point}$ denotes the feature-point matching loss; $W$ denotes the human-body pose coordinate points in the coarse try-on result; $M$ denotes the target-pose coordinate points; $W_i(x)$ denotes the abscissa of coordinate point $i$ in the coarse try-on result; $M_i(x)$ denotes the abscissa of coordinate point $i$ in the target pose; $n$ is the total number of feature points; $\lVert W_i(x) - M_i(x)\rVert$ denotes the distance along the x axis between keypoints of the same body part; and $\alpha$, $\beta$ are adjustment coefficients with $\alpha + \beta = 1$.
Example 2
As shown in Fig. 5, the virtual fitting system for gesture migration comprises a parsing guidance module, a clothing matching module, and an image fusion module.
The parsing guidance module first performs pixel extraction and texture repair from the original parsing map, the wearer image, and the target pose, and then generates the parsing guidance map through the parsing guidance network.
The clothing matching module obtains the warped clothing image through the clothing warping network according to the parsing guidance map, the target pose, and the parsing map with the lower body removed.
The image fusion module generates the try-on result in the target pose through the try-on image generation network according to the parsing guidance map, the warped clothing image, the target pose, and the wearer image.
As shown in Fig. 2, the input of the semantic parsing network is the original parsing map and the target pose map, and the output is the parsing guidance map, i.e., the parsing map after pose transfer. The original parsing map and the target pose map are each processed by 5 sequentially connected residual blocks; each residual block extracts features with 3×3 convolutions, and the residual blocks are joined by wavelet layers, which downsample the feature map in the frequency domain. The last residual block is followed by a normalization layer that strengthens the fusion of global and local features, introduces the normalization constraint loss function, and controls the upsampling process to retain more semantic detail. After normalization, the features are processed by 5 sequentially connected deconvolution layers; adjacent deconvolution layers are joined by inverse wavelet layers, which perform the upsampling, and the final deconvolution layer outputs the parsing guidance map. A skeleton sketch of this network follows.
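A minimal PyTorch skeleton of this architecture. Strided (de)convolutions stand in for the wavelet and inverse-wavelet layers (see the Haar sketch above for a frequency-domain version); the channel width and these stand-ins are assumptions made to keep the sketch short and runnable:

```python
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    """Residual block extracting features with 3x3 convolutions."""
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1))
    def forward(self, x):
        return torch.relu(x + self.body(x))

class ParseGuideNet(nn.Module):
    """Skeleton of the Fig. 2 network: 5 residual blocks joined by
    downsampling layers, a normalization layer, then 5 deconvolutions
    joined by upsampling layers. Inputs must have H, W divisible by 16."""
    def __init__(self, in_ch, out_ch, ch=64):
        super().__init__()
        self.stem = nn.Conv2d(in_ch, ch, 3, padding=1)
        self.enc = nn.ModuleList([ResBlock(ch) for _ in range(5)])
        self.down = nn.ModuleList(
            [nn.Conv2d(ch, ch, 2, stride=2) for _ in range(4)])
        self.norm = nn.InstanceNorm2d(ch)      # normalization layer
        self.up = nn.ModuleList(
            [nn.ConvTranspose2d(ch, ch, 2, stride=2) for _ in range(4)])
        self.head = nn.ConvTranspose2d(ch, out_ch, 3, padding=1)
    def forward(self, parse_map, target_pose):   # channels sum to in_ch
        x = self.stem(torch.cat([parse_map, target_pose], dim=1))
        for k, block in enumerate(self.enc):
            x = block(x)
            if k < 4:
                x = self.down[k](x)               # wavelet-layer stand-in
        x = self.norm(x)
        for up in self.up:
            x = torch.relu(up(x))                 # inverse-wavelet stand-in
        return self.head(x)                       # parsing guidance map
```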
As shown in Fig. 3, the input of the clothing warping network is the parsing guidance map and the clothing image, and the output is the warped clothing image. First, the parsing guidance map and the clothing image are each passed through an encoder, and their image features are extracted separately. Then a deformation coefficient θ is computed from the two sets of features, while the parsing guidance map and the target pose constrain the overall contour of the deformation, preventing the warping network from deforming the clothing image by force and avoiding excessive distortion. Finally, the clothing image is deformed by the warping operation with the plane deformation loss function introduced, yielding the warped clothing image. A warping sketch follows.
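A sketch of the warping operation, assuming θ parameterizes an affine sampling grid; the patent leaves the parameterization open, and a TPS grid would be applied through grid_sample the same way:

```python
import torch
import torch.nn.functional as F

def warp_garment(garment: torch.Tensor, theta: torch.Tensor) -> torch.Tensor:
    """Deform the garment with deformation coefficients theta.

    garment: (B, C, H, W); theta: (B, 2, 3) affine coefficients computed
    from the two encoders' features. The affine form is an assumption.
    """
    grid = F.affine_grid(theta, size=list(garment.shape), align_corners=False)
    return F.grid_sample(garment, grid, padding_mode="border",
                         align_corners=False)
```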
As shown in Fig. 4, the inputs of the try-on image generation network are the parsing guidance map, the warped clothing image, and the wearer image, and the output is the try-on result. The network is end-to-end and comprises a generator and a discriminator, the generator consisting of an encoder and a decoder. Under the constraint of the parsing guidance map, the generator produces a coarse try-on result from the pixel information of the warped clothing image and the wearer image; the discriminator, into which the feature-point matching loss function is introduced, judges whether the coarse result conforms to the target pose and drives the extraction of more arm-region features, enhancing the detail of the coarse try-on result and improving image clarity. A training-step sketch follows.
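One adversarial training step might look like the following; all module and loss definitions here (G, D, the keypoint extractor, the loss callables, the optimizers) are assumptions, since the patent specifies only the roles of the generator and discriminator:

```python
import torch

def train_step(G, D, pose_net, guide, warped, wearer, target_pose,
               opt_g, opt_d, adv_loss, kp_loss):
    """One adversarial step of the end-to-end try-on network (a sketch).

    pose_net extracts keypoints from an image; adv_loss and kp_loss are
    user-supplied callables. None of these definitions come from the patent.
    """
    fake = G(torch.cat([guide, warped, wearer], dim=1))
    # Discriminator: does the coarse result look real and match the pose?
    opt_d.zero_grad()
    d_loss = adv_loss(D(fake.detach()), real=False) + adv_loss(D(wearer), real=True)
    d_loss.backward()
    opt_d.step()
    # Generator: fool D and pull the result's keypoints toward the target pose.
    opt_g.zero_grad()
    g_loss = adv_loss(D(fake), real=True) + kp_loss(pose_net(fake), target_pose)
    g_loss.backward()
    opt_g.step()
    return fake.detach()
```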
The virtual fitting system for gesture migration adopts the same virtual fitting method as Example 1.
Implementation results show that the method both maintains high semantic-segmentation accuracy and increases the robustness of garment deformation, so the try-on result retains more detail. This greatly improves virtual fitting on high-resolution 2D images and improves both the fitting effect and the user experience.

Claims (6)

1. A virtual fitting method based on gesture migration, characterized by comprising the following steps:
step 1: acquiring the original parsing map and the wearer image, first extracting the clothing pixel information from the wearer image to obtain a coarse clothing image, and then performing texture repair to obtain a refined clothing image;
step 2: inputting the original parsing map, the target clothing, and the target pose into the parsing guidance network to obtain a parsing guidance map;
step 3: setting an initial limit on the warping range of the target clothing according to the parsing guidance map;
step 4: acquiring the target pose and preprocessing it to obtain a parsing map with the lower body removed, then obtaining the warped clothing image through the clothing warping network;
step 5: generating the try-on result in the target pose through the try-on image generation network according to the parsing guidance map, the warped target clothing image, the target pose, and the wearer image;
wherein the specific process of step 2 is as follows:
first, the original parsing map and the target pose are input into the parsing guidance network, and image features are extracted with its multi-layer convolutional network; a residual module and a wavelet sampling layer are added to the network to extract higher-level semantic structure, so that the network learns in depth the relational details among the parts of the human body, the wavelet sampling layer using a wavelet transform to convert the feature map into the frequency domain for downsampling, which better preserves texture information;
next, the extracted image features are input into the multi-layer deconvolution network of the parsing guidance network to upsample the image; a normalization layer is added between the convolution and deconvolution stages to strengthen the fusion of global and local features, and a normalization constraint loss function is introduced so that the upsampling process retains more semantic detail;
finally, the generated parsing guidance map is compared with the target pose in spatial position to ensure that each semantic part aligns with its corresponding pose keypoint, handling the overlap between arm and garment better, and the semantic positions are fine-tuned to obtain a more regular parsing guidance map;
the semantic parsing network comprises 5 sequentially connected residual blocks joined by wavelet layers, the wavelet layers downsampling the feature map in the frequency domain; the last residual block is followed by a normalization layer that strengthens the fusion of global and local features, introduces the normalization constraint loss function, and controls the upsampling process to retain more semantic detail; after normalization, the features pass through 5 sequentially connected deconvolution layers, adjacent deconvolution layers being joined by inverse wavelet layers used for upsampling, and the final deconvolution layer outputs the parsing guidance map;
in step 5, the try-on image generation network comprises a generator and a discriminator; the generator takes the parsing guidance map, the warped clothing image, and the wearer image as input and, under the constraint of the parsing guidance map, generates a coarse try-on result from the pixel information of the warped clothing image and the wearer image; the coarse result is then judged against the target pose by the discriminator, into which the feature-point matching loss function is introduced, and more arm-region features are extracted, enhancing the detail of the coarse try-on result and improving image clarity;
the feature point matching loss function is as follows:
in the method, in the process of the invention,representing a characteristic point matching loss function, W representing a human body posture coordinate point in a rough test result graph, M representing a target posture coordinate point, W i (x) Representing the abscissa of a human body posture coordinate point i in a rough test result diagram, M i (x) The abscissa of coordinate point i in the target gesture graph is represented, n represents the total number of characteristic points, and W is represented i (x)-M i (x) And the I represents the Euclidean distance of the key point of the same part on the x axis, alpha and beta are adjustment coefficients, and alpha+beta=1.
2. The virtual fitting method according to claim 1, characterized in that step 1 performs pixel-level repair on the clothing image, the specific repair process comprising: first, learning the edge features of the clothing image through a convolutional layer and focusing on regions where pixel values change sharply; then repairing those regions by interpolation, so that the garment edges are smooth and transition naturally into the background.
3. A virtual fitting method according to claim 2, characterized in that step 1 comprises the following sub-steps:
first, according to the clothing semantic information in the original parsing map, extracting the pixel information of the corresponding region of the wearer image to obtain a coarse clothing image, which may have blurred edges or notches along its edges;
then, performing texture repair on the clothing image by filling in the blurred and notched regions of the coarse image by interpolation to obtain a finer clothing image.
4. A virtual fitting method according to claim 3, wherein the normalization constraint loss function is as follows:

$$\mathcal{L}_{norm} = \lambda_1\,\mathcal{L}_{glb}(G,\,G') + \lambda_2\,\mathcal{L}_{loc}(L,\,L')$$

where $\mathcal{L}_{norm}$ denotes the normalization constraint loss; $G$ and $G'$ denote the global features of the image before and after parsing; $L$ and $L'$ denote the local features of the image before and after parsing; $\mathcal{L}_{glb}$ denotes the global feature-matching loss between the images before and after parsing; $\mathcal{L}_{loc}$ denotes the local feature-matching loss between the images before and after parsing; and $\lambda_1$, $\lambda_2$ are learning coefficients.
5. A virtual fitting method according to claim 3, wherein the specific process of step 4 is as follows:
first, based on the distinct pixel values of the semantic regions in the parsing guidance map, removing the lower-body semantics to obtain a parsing map with the lower body removed;
then, using the parsing map with the lower body removed and the target pose to constrain the overall contour of the garment's deformation, preventing the warping network from deforming the clothing image by force and avoiding excessive distortion;
finally, deforming the clothing image through the warping network with a plane deformation loss function introduced, to obtain the warped clothing image.
6. The virtual fitting method according to claim 5, wherein the plane deformation loss function is as follows:

$$\mathcal{L}_{flat} = \sum_{(x,\,y)} \Big( \gamma\,\lVert C_x(x+i,\,y) - C_x(x,\,y)\rVert + \delta\,\lVert C_y(x,\,y+j) - C_y(x,\,y)\rVert \Big)$$

where $C_x(x)$ and $C_y(x)$ denote the x and y coordinates of the sampling parameter, $\lVert C_x(x+i,y) - C_x(x,y)\rVert$ denotes the Euclidean distance between two nodes, $i$ and $j$ are deformation amounts, and $\gamma$ and $\delta$ are deformation coefficients.
CN202210795212.8A, filed 2022-07-07 (priority 2022-07-07): Virtual fitting method based on gesture migration. Granted as CN115272632B (Active).

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202210795212.8A (CN115272632B) | 2022-07-07 | 2022-07-07 | Virtual fitting method based on gesture migration

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202210795212.8A (CN115272632B) | 2022-07-07 | 2022-07-07 | Virtual fitting method based on gesture migration

Publications (2)

Publication Number | Publication Date
CN115272632A | 2022-11-01
CN115272632B | 2023-07-18

Family

ID=83764879

Family Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202210795212.8A (granted as CN115272632B, Active) | 2022-07-07 | 2022-07-07 | Virtual fitting method based on gesture migration

Country Status (1)

Country Link
CN (1) CN115272632B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116206332B (en) * 2023-01-31 2023-08-08 北京数美时代科技有限公司 Pedestrian re-recognition method, system and storage medium based on attitude estimation
CN116824002B (en) * 2023-06-19 2024-02-20 深圳市毫准科技有限公司 AI clothing try-on result output method based on fake model and related equipment

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120287122A1 (en) * 2011-05-09 2012-11-15 Telibrahma Convergent Communications Pvt. Ltd. Virtual apparel fitting system and method
JP5605885B1 (en) * 2014-02-27 2014-10-15 木下 泰男 Virtual try-on system and virtual try-on program
CN110211196B (en) * 2019-05-28 2021-06-15 山东大学 Virtual fitting method and device based on posture guidance
CN113297944A (en) * 2020-12-28 2021-08-24 武汉纺织大学 Human body posture transformation method and system for virtual fitting of clothes
CN113052980B (en) * 2021-04-27 2022-10-14 云南大学 Virtual fitting method and system

Also Published As

Publication Number | Publication Date
CN115272632A | 2022-11-01

Similar Documents

Publication Publication Date Title
CN115272632B (en) Virtual fitting method based on gesture migration
CN109376582B (en) Interactive face cartoon method based on generation of confrontation network
EP0725364B1 (en) Image processing apparatus
CN103268623B (en) A kind of Static Human Face countenance synthesis method based on frequency-domain analysis
CN117011207A (en) Virtual fitting method based on diffusion model
CN109684973B (en) Face image filling system based on symmetric consistency convolutional neural network
CN110853119B (en) Reference picture-based makeup transfer method with robustness
CN114663552B (en) Virtual fitting method based on 2D image
CN113222875B (en) Image harmonious synthesis method based on color constancy
CN113160033A (en) Garment style migration system and method
CN116168186A (en) Virtual fitting chart generation method with controllable garment length
CN109903320B (en) Face intrinsic image decomposition method based on skin color prior
CN106228590B (en) A kind of human body attitude edit methods in image
CN117315735A (en) Face super-resolution reconstruction method based on priori information and attention mechanism
CN113516604B (en) Image restoration method
CN117593178A (en) Virtual fitting method based on feature guidance
CN112241708A (en) Method and apparatus for generating new person image from original person image
CN111524204A (en) Portrait hair animation texture generation method
Zhang et al. Automatic generation of sketch-like pencil drawing from image
CN116362995A (en) Tooth image restoration method and system based on standard prior
CN115937429A (en) Fine-grained 3D face reconstruction method based on single image
CN114331894A (en) Face image restoration method based on potential feature reconstruction and mask perception
CN113781372A (en) Deep learning-based opera facial makeup generation method and system
Ren et al. Depth up-sampling via pixel-classifying and joint bilateral filtering
Chen et al. NeuralReshaper: single-image human-body retouching with deep neural networks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant