CN114663552A - Virtual fitting method based on 2D image - Google Patents

Virtual fitting method based on 2D image

Info

Publication number
CN114663552A
CN114663552A (application CN202210573730.5A)
Authority
CN
China
Prior art keywords
image
graph
semantic
clothing
network
Prior art date
Legal status
Granted
Application number
CN202210573730.5A
Other languages
Chinese (zh)
Other versions
CN114663552B (en)
Inventor
姜明华
史衍康
余锋
花爱玲
周昌龙
宋坤芳
Current Assignee
Wuhan Textile University
Original Assignee
Wuhan Textile University
Priority date
Filing date
Publication date
Application filed by Wuhan Textile University filed Critical Wuhan Textile University
Priority to CN202210573730.5A priority Critical patent/CN114663552B/en
Publication of CN114663552A publication Critical patent/CN114663552A/en
Application granted granted Critical
Publication of CN114663552B publication Critical patent/CN114663552B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/00 2D [Two Dimensional] image generation
    • G06T11/60 Editing figures and text; Combining figures or text
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2210/00 Indexing scheme for image generation or computer graphics
    • G06T2210/16 Cloth

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Processing Or Creating Images (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a virtual fitting method based on a 2D image, and relates to the technical field of virtual fitting. The method comprises the following steps: inputting the image data required for virtual fitting, and generating a try-on region semantic graph, a non-try-on region semantic graph and a clothing correction graph through a preprocessing operation; generating a post-try-on human body semantic segmentation graph through an end-to-end semantic splicing network under the constraint of an introduced integrated discriminator and multi-level loss function; predicting a deformed clothing image through a clothing deformation network under the constraint of an introduced frequency domain classifier and regular constraint loss function; splicing the try-on person image with the deformed clothing image to obtain a rough try-on result graph; and calculating and shrinking the artifact region through a normalization network to finally obtain a fine try-on result graph. The method solves the prior-art problem of artifacts caused by misalignment between the clothing and the human body, so that the generated picture retains more details and a high-resolution try-on picture is produced.

Description

Virtual fitting method based on 2D image
Technical Field
The present invention relates to the field of virtual fitting technology, and more particularly, to a virtual fitting method based on 2D images.
Background
In recent years, the rapid development of digital technology and the continuous upgrading of artificial intelligence have provided important impetus and support for the development of virtual fitting technology. In particular, after the sudden outbreak of the COVID-19 epidemic, consumers became wary of close person-to-person contact, the stay-at-home economy quietly flourished, and everyday clothing purchases moved online. Virtual fitting enables sellers to display the strengths and weaknesses of garments more objectively, so that both parties to a transaction can grasp the information more intuitively; this facilitates transactions, reduces unnecessary workload, improves working efficiency, and meets user needs.
At present, many research institutions at home and abroad are conducting related research, but 2D image synthesis is still limited to low resolutions, which is a key factor affecting the satisfaction of online consumers. As image resolution increases, artifacts in the misaligned regions between the deformed clothing and the human body become obvious in the final result, and the clarity of the body parts and clothing textures deteriorates, seriously affecting the try-on effect.
Chinese patent publication No. CN112232914A discloses a four-stage virtual fitting method and apparatus based on 2D images, which assists in generating the try-on composite image by predicting the post-try-on semantic segmentation image and arm image, addressing the misalignment artifact problem of the prior art; however, that technique cannot realize virtual fitting on high-resolution 2D images and performs poorly at preserving the texture clarity of high-resolution images.
Disclosure of Invention
Aiming at the above defects or improvement needs of the prior art, the present invention provides a virtual fitting method based on a 2D image, which aims to solve the problem of artifacts in high-resolution try-on images, retain more texture details, and improve the try-on effect.
To achieve the above object, according to an aspect of the present invention, there is provided a virtual fitting method based on 2D images, comprising the steps of:
(1) inputting a 2D image, a human body semantic segmentation image and a clothing image of a try-on person, dividing the human body semantic segmentation image into a try-on semantic area and a non-try-on semantic area through preprocessing operation, and performing masking and correction processing on the clothing image to generate a clothing mask image and a clothing correction image;
(2) inputting a clothing semantic graph and a non-fitting area semantic graph, and generating a human body semantic segmentation graph after fitting through an end-to-end semantic splicing network under the constraint of introducing an integrated discriminator and a multi-level loss function;
(3) predicting a deformed clothing image under the constraint of introducing a frequency domain classifier and a regular constraint loss function through a clothing deformation network according to the tried-on semantic segmentation image, the human body posture image and the clothing image;
(4) according to the image of the try-on person, coloring the semantic graph of the non-try-on area to obtain a colored graph of the non-try-on area, and splicing the colored graph with the deformed clothing image to obtain a rough try-on result graph;
(5) inputting the rough fitting result image and the semantic segmentation image after fitting into a normalized network, calculating and reducing an artifact area, and finally obtaining a fine fitting result image.
Preferably, in the virtual fitting method based on 2D images, the preprocessing process in step (1) comprises two parts. In the first part, a human body semantic segmentation graph is input; it contains 7 kinds of semantic information, in order: that of the hair, face, neck, hands, lower garment, clothing, and arms, each labeled with a different pixel value, which may be expressed as Y = {a_1, a_2, a_3, …, a_7}. During preprocessing, according to the different pixel values of the semantic information, the human body semantic segmentation graph is divided into a non-try-on region semantic graph Y_1 = {a_1, a_2, a_3, a_4, a_5} and a try-on region semantic graph Y_2 = {a_6, a_7}; the non-try-on region semantic graph contains the semantic information of the hair, face, neck, hands and lower garment, and the try-on region semantic graph contains the semantic information of the clothing and arms. In the second part, a clothing image is input and mask processing is applied: the pixel values of the clothing region in the clothing image are set to 0 and the background pixel values to 255, yielding the clothing mask graph; an affine transformation is then applied to the clothing image according to the position information of the pixel points in the try-on region semantic graph, so that the outline of the clothing image is consistent with the outline of the try-on region semantic graph, yielding the clothing correction graph in preparation for the subsequent clothing deformation.
Preferably, the semantic splicing network in the virtual fitting method based on 2D images is an end-to-end network. The clothing semantic graph and the non-try-on region semantic graph are input; image features are extracted and downsampled through convolution layers, and the compressed image is expanded back to its original size through deconvolution layers to obtain a rough post-try-on human body semantic segmentation graph, which is finally input to the integrated discriminator for similarity calculation; after multiple rounds of semantic splicing, the post-try-on human body semantic segmentation graph with the highest similarity is output. A multi-level loss function is introduced during semantic splicing to perform fine-grained optimization of the prediction result and encourage the semantic splicing network to generate finer details.
Preferably, the integrated discriminator is formed by splicing together n deep convolutional networks with identical structure, where n is related to the resolution of the input picture; each deep convolutional network consists of a pooling layer, a convolution layer, a normalization layer and an activation layer connected in sequence. The inputs of the integrated discriminator are a real picture and a predicted picture, and its output is the feature similarity of the two pictures. The n deep convolutional networks distinguish the real image from the predicted image at n different scales, and the influence of each network's prediction on the final result is controlled by a set coefficient, encouraging the semantic splicing network to generate more details. Each deep convolutional network corresponds to a feature matching loss function, and together these form the multi-level loss function of the integrated discriminator.
Preferably, the multi-level loss function is expressed as:

L_multi = Σ_{i=1}^{n} λ_i · L_FM(D_i, G)

where L_FM(D_i, G) denotes the feature matching loss between deep convolutional network D_i and the image generated by the semantic splicing network G, and λ_i is a coefficient that controls the influence of the deep convolutional network loss at each scale on the multi-level loss function.
Preferably, the garment deformation network in the 2D image-based virtual fitting method is an end-to-end network whose inputs are the post-try-on semantic segmentation graph, the clothing image and the human body posture graph. Under the constraint of the 18 key point positions in the human body posture graph, pixel information at the key point positions of the post-try-on semantic segmentation graph and the clothing image is extracted through two branches; a frequency domain classifier is introduced, the pixel information of the deformed clothing image is predicted by an interpolation method, a regular constraint loss function is introduced to control the degree of clothing deformation, and the deformed clothing image is finally output.
The regular constraint loss function is expressed as:

L_reg = Σ [ α·(x′ − x)² + β·(y′ − y)² ]

where x and x′ denote the x-values of a coordinate point before and after deformation respectively, y and y′ denote its y-values before and after deformation respectively, (x′ − x)² represents the square of the coordinate change along the x-axis, (y′ − y)² represents the square of the coordinate change along the y-axis, and α, β are two learnable parameters used to control the degree of clothing deformation along the x-axis and the y-axis;
the loss function of the garment deformation network is expressed as:
Figure 119569DEST_PATH_IMAGE010
in the above formula, the first and second carbon atoms are,
Figure 100002_DEST_PATH_IMAGE011
is a conditional-confrontation-loss function,
Figure 798943DEST_PATH_IMAGE012
is a function of the cross-entropy loss of the pixel,
Figure 100002_DEST_PATH_IMAGE013
is a function of the loss of the regular constraint,λ 1λ 2λ 3are coefficients that are used to adjust the effect of the three loss functions on the overall loss function.
Preferably, the frequency domain classifier divides the features in frequency domain space by setting a threshold. On the one hand, more attention is placed on the generated low-frequency feature information to enhance the content information of the image; on the other hand, noise points are distinguished from high-frequency information with the aid of the post-try-on semantic segmentation graph so as to extract more useful high-frequency information, and interpolation is used to enhance the texture fineness of the high-frequency regions of the image, thereby improving the quality of the generated image. The specific process of the frequency domain classifier comprises the following steps:
first, the spatial domain feature map of the clothing image is extracted by the convolution layers, and the frequency domain feature map is obtained by discrete Fourier transform, which may be expressed as:

F_{i,j} = DFT(S_{i,j})

where S_{i,j} denotes the spatial feature component in row i, column j of the spatial domain feature map, F_{i,j} denotes the frequency feature component in row i, column j of the frequency domain feature map, and DFT(·) is the discrete Fourier transform function;
then, the importance weight of each frequency feature component in the frequency domain feature map is learned through a neural network structure; a frequency feature component whose weight is greater than the threshold is identified as a low-frequency information feature component, and one whose weight is less than the threshold is identified as a high-frequency information feature component, which may be expressed as:

F_{i,j}^z with z = 0 if k_{i,j} > λ, and z = 1 if k_{i,j} ≤ λ

where F_{i,j}^z denotes the frequency feature component in row i, column j whose category is z (z = 0 indicates a low-frequency information feature component and z = 1 a high-frequency information feature component), k_{i,j} denotes the importance weight of the frequency feature component in row i, column j, and λ is the threshold;
and finally, the noise information in the high-frequency information feature channel is further screened using the position information in the post-try-on semantic segmentation graph, image noise is filtered out with a low-pass filter, and the image content and texture information are filled in by an interpolation method.
Optionally, the spline interpolation method includes: tension spline interpolation, regular spline interpolation, and thin plate spline interpolation.
Preferably, step (4) of the virtual fitting method based on 2D images comprises: first, the pixel point values of the try-on person image are extracted and reassigned to the corresponding regions of the non-try-on region semantic graph, re-coloring it to obtain the non-try-on region color graph; the non-try-on region color graph and the deformed clothing image are then input together into the semantic splicing network to obtain a rough try-on result graph. In the semantic splicing network, the features of the non-try-on region color graph and the deformed clothing image are first extracted and downsampled through convolution layers; the compressed image is then expanded back to its original size through deconvolution layers to obtain a rough try-on result graph, which is finally input to the integrated discriminator for similarity calculation; after multiple rounds of semantic splicing, the try-on result graph with the highest similarity is output. A multi-level loss function is introduced during semantic splicing to perform fine-grained optimization of the prediction result and encourage the semantic splicing network to generate finer details.
Preferably, the artifact-removal processing in the virtual fitting method based on 2D images comprises the following steps. First, the image features of the rough try-on result graph are extracted through a deep neural network, the semantic segmentation graph of the rough try-on result graph is obtained by upsampling, and its difference from the post-try-on semantic segmentation graph is computed, which may be expressed as:

Artifact = R − S

where Artifact denotes the pixel information of the artifact region, R denotes the rough try-on result graph, and S denotes the post-try-on semantic segmentation graph.

This yields an image in which only a small fraction of pixel values are greater than 0; the pixel points whose values are greater than 0 are the pixel points of the artifact region, and the region is located in the rough try-on result graph according to the position information of the artifact region. Then, two deformation coefficients α and β are learned through a multi-layer normalization network, which respectively control the degree of shrinking of the artifact region along the x-axis and y-axis directions, to reduce its size. Finally, the adjusted feature picture is upsampled back to the original picture size through an upsampling layer, and the lost pixel point values are filled in by interpolation to generate a finer try-on result picture.
Optionally, the interpolation method includes: nearest neighbor interpolation, bilinear interpolation, and bicubic interpolation.
Preferably, the input of the normalization network is the rough try-on result graph and its output is the fine try-on result graph; the normalization network comprises two stages: acquiring the artifact semantic graph and eliminating the artifact information.

The first stage, acquiring the artifact semantic graph, comprises two sub-networks, denoted network 1-1 and network 1-2. Network 1-1 comprises n convolution layers, which extract image features, and n deconvolution layers, which map the image back to its original size; the output of each convolution layer serves as the input of the corresponding deconvolution layer, the convolution layers are connected to the deconvolution layers through residual structures, and the result is output after the last deconvolution layer. Network 1-2 comprises an image difference module whose inputs are two images of the same size and whose output is the result graph of their difference. The value of n is related to the size of the input picture; in general, when the input picture size is 1024 × 768, n = 3 or 4 gives good experimental results.

The second stage, eliminating the artifact information, comprises one network, 2-1, which consists of two parallel branches: the first branch is a convolution layer, an activation layer, a region regularization layer, a downsampling layer, an upsampling layer and an addition layer connected in series, and the second branch is a single region regularization layer; the final result is output after the addition layer.

The region regularization layer is used to standardize the artifact region: it first obtains the spatial position of the artifact region and then calculates the deformation coefficients α and β through a convolution network; a planar deformation loss function is introduced to control the degree of shrinking of the artifact region and to prevent the pixel point information of the human body parts from being wrongly modified by excessive shrinking, thereby assisting in shrinking the artifact region. The addition layer is used to fuse the output results of the two branches.
Preferably, the planar deformation loss function is expressed as:

L_plane = Σ d( (x, y), (x + α·i, y + β·j) )

where x, y denote the coordinates of a feature point of the artifact region, d(·, ·) denotes the Euclidean distance between two nodes, i, j are the deformation amounts, and α, β are the deformation coefficients.
In general, compared with the prior art, the above technical solution contemplated by the present invention can achieve the following beneficial effects:
(1) The input pictures are preprocessed to distinguish the try-on region from the non-try-on region, so the pixel information of the non-try-on region is better preserved and the try-on result graph retains more details; the clothing image is spatially corrected, which facilitates its subsequent deformation.
(2) The human body posture graph containing more key point information is used to assist in generating the semantic segmentation image and the deformed clothing image, which makes semantic segmentation more accurate and clothing deformation more robust; the regular constraint loss function and the frequency domain classifier are introduced, solving the problem of clothing texture distortion caused by abnormal warping deformation.
(3) The normalization network solves the problem of artifacts caused by mismatch between the clothing and the human body at high resolution, greatly improving the virtual fitting effect of high-resolution 2D images and enhancing the try-on effect and user experience.
Drawings
Fig. 1 is a schematic flow chart of a virtual fitting method based on a 2D image according to an embodiment of the present invention;
fig. 2 is a normalized network diagram of a virtual fitting method based on a 2D image according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and do not limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
Fig. 1 is a schematic flow chart of a virtual fitting method for a 2D image according to an embodiment; the embodiment provides a virtual fitting method of a 2D image, which comprises the following steps:
(1) inputting a 2D image, a human body semantic segmentation image and a clothing image of a try-on person, dividing the human body semantic segmentation image into a try-on semantic area and a non-try-on semantic area through preprocessing operation, and performing masking and correction processing on the clothing image to generate a clothing mask image and a clothing correction image;
wherein, the pretreatment process comprises two parts: one part, inputting a human body semantic segmentation graph, comprising 7 semantic information which are sequentially: semantic information for hair, face, neck, hand, under-garment, clothing, and arm, and labeling each semantic information with a different pixel value may be expressed as: y = & a 1, a 2, a 3,…, a 7In pretreatmentIn the process, according to the difference of the pixel values of each semantic information, the human body semantic segmentation graph is divided into non-try-on area semantic graphs: y is1={a 1, a 2, a 3, a 4, a 5And a semantic map of a try-on area: y is2={a 6,a 7And the semantic graph of the non-fitting area comprises: semantic information of hair, face, neck, hand, lower garment, the fitting region semantic graph includes: semantic information of clothing and arms; and inputting a clothing image, performing mask processing on the clothing image, namely setting the pixel value of a clothing region in the clothing image to be 0 and the pixel value of the background to be 255 to obtain a clothing mask image, and performing affine transformation on the clothing image according to the position information of the pixel points in the semantic graph of the try-on region to enable the outline of the clothing image to be consistent with the outline of the semantic graph of the try-on region, thereby obtaining a clothing correction graph and preparing for subsequent clothing deformation.
(2) Inputting a clothing semantic graph and a non-fitting area semantic graph, generating a fitted human body semantic segmentation graph through an end-to-end semantic splicing network under the constraint of introducing an integrated discriminator and a multi-level loss function, wherein the clothing semantic graph is prepared in advance and is input data of the network;
the semantic splicing network is an end-to-end network, a clothing semantic graph and a non-fitting area semantic graph are input, image features are extracted through a convolution layer and down-sampling is carried out, a compressed image is expanded into an original size through a reverse convolution layer, a rough fitting human body semantic segmentation graph is obtained, the rough fitting human body semantic segmentation graph is input to an integrated discriminator to carry out similarity calculation, and a fitting human body semantic segmentation graph with the highest similarity is output after multiple rounds of semantic splicing. And a multi-level loss function is introduced in the semantic splicing process, the prediction result is subjected to fine-grained optimization, and a semantic splicing network is encouraged to generate finer details.
The integrated discriminator is formed by splicing together n deep convolutional networks with identical structure; n is related to the resolution of the input picture, and when the input picture size is 1024 × 768, n = 3 or 4 gives good experimental results. Each deep convolutional network consists of a pooling layer, a convolution layer, a normalization layer and an activation layer connected in sequence. The inputs of the integrated discriminator are a real picture and a predicted picture, and its output is the feature similarity of the two pictures. The n deep convolutional networks distinguish the real image from the predicted image at n different scales, and the influence of each network's prediction on the final result is controlled by a set coefficient, encouraging the semantic splicing network to generate more details. Each deep convolutional network corresponds to a feature matching loss function, and together these form the multi-level loss function of the integrated discriminator.
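The integrated discriminator can be sketched as n structurally identical sub-networks applied to progressively downsampled copies of the same picture; the layer widths, the use of average pooling and the 1-channel realness head below are assumptions of this sketch.

```python
import torch
import torch.nn as nn

class ScaleDiscriminator(nn.Module):
    """One of the n identical deep convolutional networks: pooling layer,
    convolution layer, normalization layer and activation layer in sequence,
    followed by a 1-channel projection that scores realness."""
    def __init__(self, in_ch=8, base=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.AvgPool2d(2),
            nn.Conv2d(in_ch, base, 4, stride=2, padding=1),
            nn.InstanceNorm2d(base),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(base, 1, 4, padding=1),
        )

    def forward(self, x):
        return self.net(x)

class IntegratedDiscriminator(nn.Module):
    """Applies the n discriminators at n scales: each successive network
    sees a 2x further downsampled copy of the same picture."""
    def __init__(self, n=3, in_ch=8):
        super().__init__()
        self.discs = nn.ModuleList(ScaleDiscriminator(in_ch) for _ in range(n))
        self.down = nn.AvgPool2d(2)

    def forward(self, x):
        outputs = []
        for disc in self.discs:
            outputs.append(disc(x))
            x = self.down(x)
        return outputs
```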
Wherein, the multi-level loss function is expressed as:

L_multi = Σ_{i=1}^{n} λ_i · L_FM(D_i, G)

where L_FM(D_i, G) denotes the existing feature matching loss between deep convolutional network D_i and the image generated by the semantic splicing network G, and λ_i is a coefficient that controls the influence of the deep convolutional network loss at each scale on the multi-level loss function.
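Under this reconstruction, the multi-level loss is a weighted sum of per-scale feature matching losses. The sketch below assumes each discriminator D_i exposes its intermediate feature maps and uses an L1 feature matching term, a common but here assumed choice.

```python
import torch.nn.functional as F

def multi_level_loss(real_feats, fake_feats, lambdas):
    """L_multi = sum_i lambda_i * L_FM(D_i, G).
    real_feats[i] / fake_feats[i]: lists of intermediate feature maps taken
    from discriminator D_i on the real and generated pictures;
    lambdas[i]: the per-scale coefficient lambda_i."""
    total = 0.0
    for lam, reals, fakes in zip(lambdas, real_feats, fake_feats):
        fm = sum(F.l1_loss(f, r.detach()) for r, f in zip(reals, fakes))
        total = total + lam * fm
    return total
```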
(3) Predicting a deformed clothing image under the constraint of introducing a frequency domain classifier and a regular constraint loss function through a clothing deformation network according to the tried-on semantic segmentation image, the human body posture image and the clothing image, wherein the human body posture image is prepared in advance and is input data of the network;
the clothing deformation network is an end-to-end network, the input is a tried-on semantic segmentation graph, a clothing image and a human body posture graph, under the constraint of 18 key point positions in the human body posture graph, pixel information of the key point positions of the tried-on semantic segmentation graph and the clothing image is respectively extracted through two branches, a frequency domain classifier is introduced, the pixel information of the deformed clothing image is predicted by using an interpolation method, a regular constraint loss function is introduced to control the deformation degree of clothing, and finally the deformed clothing image is output, wherein the 18 key points comprise: nose, neck, right shoulder, right elbow, right wrist, left shoulder, left elbow, left wrist, right hip, right knee, right ankle, left hip, left knee, left ankle, right eye, left eye, right ear, and left ear.
The frequency domain classifier divides the features in frequency domain space by setting a threshold. On the one hand, more attention is placed on the generated low-frequency feature information to enhance the content information of the image; on the other hand, noise points are distinguished from high-frequency information with the aid of the post-try-on semantic segmentation graph so as to extract more useful high-frequency information, and interpolation is used to enhance the texture fineness of the high-frequency regions of the image, thereby improving the quality of the generated image. The specific process of the frequency domain classifier comprises the following steps:
first, the spatial domain feature map of the clothing image is extracted by the convolution layers, and the frequency domain feature map is obtained by discrete Fourier transform, which may be expressed as:

F_{i,j} = DFT(S_{i,j})

where S_{i,j} denotes the spatial feature component in row i, column j of the spatial domain feature map, F_{i,j} denotes the frequency feature component in row i, column j of the frequency domain feature map, and DFT(·) is the discrete Fourier transform function;
then, the importance weight of each frequency feature component in the frequency domain feature map is learned through a neural network structure; a frequency feature component whose weight is greater than the threshold is identified as a low-frequency information feature component, and one whose weight is less than the threshold is identified as a high-frequency information feature component, which may be expressed as:

F_{i,j}^z with z = 0 if k_{i,j} > λ, and z = 1 if k_{i,j} ≤ λ

where F_{i,j}^z denotes the frequency feature component in row i, column j whose category is z (z = 0 indicates a low-frequency information feature component and z = 1 a high-frequency information feature component), k_{i,j} denotes the importance weight of the frequency feature component in row i, column j, and λ is the threshold;
and finally, the noise information in the high-frequency information feature channel is further screened using the position information in the post-try-on semantic segmentation graph, image noise is filtered out with a low-pass filter, and the image content and texture information are filled in by an interpolation method.
The spline interpolation methods include: tension spline interpolation, regular spline interpolation, and thin plate spline interpolation.
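A sketch of the frequency domain classifier using torch.fft follows; the 1 × 1 convolution weight predictor, the sigmoid weighting and the fixed threshold value are illustrative assumptions.

```python
import torch
import torch.nn as nn

class FrequencyDomainClassifier(nn.Module):
    """Sketch: transform the spatial feature map with a discrete Fourier
    transform (F = DFT(S)), learn an importance weight k_{i,j} for each
    frequency component, and split components into low-frequency (k > lambda)
    and high-frequency (k <= lambda) parts."""
    def __init__(self, channels, threshold=0.5):
        super().__init__()
        # Real and imaginary parts are stacked so the weight net sees 2C channels.
        self.weight_net = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        self.threshold = threshold  # the lambda of the patent

    def forward(self, spatial_feat):
        freq = torch.fft.fft2(spatial_feat)               # F = DFT(S)
        stacked = torch.cat([freq.real, freq.imag], dim=1)
        k = self.weight_net(stacked)                      # importance weights
        low_mask = (k > self.threshold).float()
        low = freq * low_mask           # z = 0: low-frequency components
        high = freq * (1.0 - low_mask)  # z = 1: high-frequency components
        return low, high
```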
Wherein the regular constraint loss function is expressed as:

L_reg = Σ [ α·(x′ − x)² + β·(y′ − y)² ]

where x and x′ denote the x-values of a coordinate point before and after deformation respectively, y and y′ denote its y-values before and after deformation respectively, (x′ − x)² represents the square of the coordinate change along the x-axis, (y′ − y)² represents the square of the coordinate change along the y-axis, and α, β are two learnable parameters used to control the degree of clothing deformation along the x-axis and the y-axis;
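Read this way, the regular constraint loss simply penalizes the squared per-point displacement along each axis, weighted by the learnable α and β; a sketch under that reading:

```python
import torch

def regular_constraint_loss(pts_before, pts_after, alpha, beta):
    """L_reg = sum over points of alpha*(x' - x)^2 + beta*(y' - y)^2.
    pts_before, pts_after: (N, 2) tensors of (x, y) coordinates of the
    clothing control points before and after deformation; alpha, beta:
    the two learnable parameters for the x- and y-axis deformation degree."""
    dx2 = (pts_after[:, 0] - pts_before[:, 0]) ** 2
    dy2 = (pts_after[:, 1] - pts_before[:, 1]) ** 2
    return (alpha * dx2 + beta * dy2).sum()
```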
wherein the loss function of the garment deformation network is expressed as:

L = λ_1·L_cGAN + λ_2·L_CE + λ_3·L_reg

where L_cGAN is the existing conditional adversarial loss function, L_CE is the existing pixel cross-entropy loss function, L_reg is the regular constraint loss function, and λ_1, λ_2, λ_3 are coefficients used to adjust the influence of the three loss functions on the overall loss function.
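Combining the three terms gives the overall objective. In the sketch below, the λ values, the non-saturating form of the conditional adversarial term and the inlined regular constraint are all assumptions for illustration:

```python
import torch
import torch.nn.functional as F

def deformation_network_loss(disc_fake_logits, pred_logits, target_labels,
                             pts_before, pts_after, alpha, beta,
                             lam1=1.0, lam2=1.0, lam3=0.1):
    """L = lam1*L_cGAN + lam2*L_CE + lam3*L_reg."""
    # Conditional adversarial term: drive discriminator output on fakes to "real".
    l_cgan = F.binary_cross_entropy_with_logits(
        disc_fake_logits, torch.ones_like(disc_fake_logits))
    # Pixel cross-entropy between predicted and target segmentation labels.
    l_ce = F.cross_entropy(pred_logits, target_labels)
    # Regular constraint term, inlined from the formula above.
    l_reg = (alpha * (pts_after[:, 0] - pts_before[:, 0]) ** 2
             + beta * (pts_after[:, 1] - pts_before[:, 1]) ** 2).sum()
    return lam1 * l_cgan + lam2 * l_ce + lam3 * l_reg
```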
(4) According to the try-on person image, the non-try-on region semantic graph is colored to obtain the non-try-on region color graph, which is then spliced with the deformed clothing image to obtain a rough try-on result graph.
Firstly, the pixel point values of the try-on person image are extracted and reassigned to the corresponding regions of the non-try-on region semantic graph, re-coloring it to obtain the non-try-on region color graph; the non-try-on region color graph and the deformed clothing image are then input together into the semantic splicing network to obtain a rough try-on result graph. In the semantic splicing network, the features of the non-try-on region color graph and the deformed clothing image are first extracted and downsampled through convolution layers; the compressed image is then expanded back to its original size through deconvolution layers to obtain a rough try-on result graph, which is finally input to the integrated discriminator for similarity calculation; after multiple rounds of semantic splicing, the try-on result graph with the highest similarity is output. A multi-level loss function is introduced during semantic splicing to perform fine-grained optimization of the prediction result and encourage the semantic splicing network to generate finer details.
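The re-coloring step amounts to copying the try-on person's pixels into the labeled portion of the non-try-on region semantic graph; a minimal sketch (treating a pixel value of 0 as "unlabeled" is an assumption):

```python
import numpy as np

def color_non_try_on(person_img, non_try_on_sem):
    """Assign the try-on person's pixel values to the corresponding labeled
    pixels of the non-try-on region semantic map, leaving the rest black."""
    colored = np.zeros_like(person_img)
    region = non_try_on_sem > 0          # labeled (non-try-on) pixels
    colored[region] = person_img[region]
    return colored
```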
(5) Inputting the rough fitting result image and the fitted semantic segmentation image into a normalized network, calculating and reducing an artifact area, and finally obtaining a fine fitting result image.
The processing procedure for removing artifacts comprises the following steps. First, the image features of the rough try-on result graph are extracted through a deep neural network, the semantic segmentation graph of the rough try-on result graph is obtained by upsampling, and its difference from the post-try-on semantic segmentation graph is computed, which may be expressed as:

Artifact = R − S

where Artifact denotes the pixel information of the artifact region, R denotes the rough try-on result graph, and S denotes the post-try-on semantic segmentation graph.

This yields an image in which only a small fraction of pixel values are greater than 0; the pixel points whose values are greater than 0 are the pixel points of the artifact region, and the region is located in the rough try-on result graph according to the position information of the artifact region. Then, two deformation coefficients α and β are learned through the normalization network, which respectively control the degree of shrinking of the artifact region along the x-axis and y-axis directions, to reduce its size. Finally, the adjusted feature picture is upsampled back to the original picture size through an upsampling layer, and the lost pixel point values are filled in by interpolation to generate a finer try-on result picture.
Wherein the interpolation method comprises: nearest neighbor interpolation, bilinear interpolation, and bicubic interpolation.
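The Artifact = R − S computation reduces to an integer subtraction of the two segmentation maps followed by thresholding at zero, as sketched below:

```python
import numpy as np

def artifact_mask(rough_result_seg, tryon_seg):
    """Artifact = R - S: subtract the post-try-on semantic segmentation map S
    from the segmentation map R derived from the rough try-on result; pixels
    where the difference is greater than 0 form the artifact region."""
    diff = rough_result_seg.astype(np.int32) - tryon_seg.astype(np.int32)
    return diff > 0
```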
The input of the normalization network is the rough try-on result graph and its output is the fine try-on result graph; the network comprises two stages: acquiring the artifact semantic graph and eliminating the artifact information.

The first stage, acquiring the artifact semantic graph, comprises two sub-networks, denoted network 1-1 and network 1-2. Network 1-1 comprises n convolution layers, which extract image features, and n deconvolution layers, which map the image back to its original size; the output of each convolution layer serves as the input of the corresponding deconvolution layer, the convolution layers are connected to the deconvolution layers through residual structures, and the result is output after the last deconvolution layer. Network 1-2 comprises an image difference module whose inputs are two images of the same size and whose output is the result graph of their difference. The value of n is related to the size of the input picture; in general, when the input picture size is 1024 × 768, n = 3 or 4 gives good experimental results.

The second stage, eliminating the artifact information, comprises one network, 2-1, which consists of two parallel branches: the first branch is a convolution layer, an activation layer, a region regularization layer, a downsampling layer, an upsampling layer and an addition layer connected in series, and the second branch is a single region regularization layer; the final result is output after the addition layer.

The region regularization layer is used to standardize the artifact region: it first obtains the spatial position of the artifact region and then calculates the deformation coefficients α and β through a convolution network; a planar deformation loss function is introduced to control the degree of shrinking of the artifact region and to prevent the pixel point information of the human body parts from being wrongly modified by excessive shrinking, thereby assisting in shrinking the artifact region. The addition layer is used to fuse the output results of the two branches.
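A sketch of the second stage follows. The patent fixes the branch layout (convolution, activation, region regularization, downsampling, upsampling, then an addition layer fusing both branches) but not the layer hyperparameters; predicting α and β with a small convolution head and applying them as an attenuation inside the artifact mask are assumptions of this sketch.

```python
import torch
import torch.nn as nn

class RegionRegularizationLayer(nn.Module):
    """Predicts the deformation coefficients alpha and beta with a small
    convolution network and uses them to shrink (attenuate) features inside
    the artifact region only."""
    def __init__(self, channels):
        super().__init__()
        self.coef_net = nn.Sequential(
            nn.Conv2d(channels, 2, kernel_size=3, padding=1),
            nn.Sigmoid(),  # alpha, beta in (0, 1)
        )

    def forward(self, feat, artifact_mask):
        coeffs = self.coef_net(feat)               # (B, 2, H, W)
        shrink = coeffs.mean(dim=1, keepdim=True)  # combined alpha/beta factor
        return feat * (1.0 - artifact_mask * (1.0 - shrink))

class ArtifactElimination(nn.Module):
    """Stage two of the normalization network: two parallel branches whose
    outputs are fused by an addition layer."""
    def __init__(self, channels=64):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)
        self.act = nn.ReLU(inplace=True)
        self.reg1 = RegionRegularizationLayer(channels)
        self.down = nn.AvgPool2d(2)
        self.up = nn.Upsample(scale_factor=2, mode="bilinear",
                              align_corners=False)
        self.reg2 = RegionRegularizationLayer(channels)

    def forward(self, feat, artifact_mask):
        b1 = self.reg1(self.act(self.conv(feat)), artifact_mask)
        b1 = self.up(self.down(b1))        # downsample then upsample branch 1
        b2 = self.reg2(feat, artifact_mask)
        return b1 + b2                     # addition layer fuses both branches
```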
Wherein the planar deformation loss function is expressed as:

L_plane = Σ d( (x, y), (x + α·i, y + β·j) )

where x, y denote the coordinates of a feature point of the artifact region, d(·, ·) denotes the Euclidean distance between two nodes, i, j are the deformation amounts, and α, β are the deformation coefficients.
Fig. 2 is a normalized network diagram of a virtual fitting method of a 2D image according to an embodiment;
the normalized network is a trial result graph with coarse input and fine output after removing the false shadow. The normalized network consists of two stages: the first stage is used for acquiring an artifact semantic map and comprises 3 convolutional layers, 3 deconvolution layers and a difference calculating module; the second stage is used for eliminating artifact information and comprises two branches, wherein one branch comprises a convolution layer, an activation layer, a region regularization layer, a down-sampling layer and an up-sampling layer, the other branch only comprises the region regularization layer, and finally, the results of the two branches are fused through an addition layer.
The method not only enables the semantic segmentation precision to be higher, but also increases the robustness of clothing deformation, enables the fitting result image to retain more details, greatly improves the virtual fitting effect of the high-resolution 2D image, and improves the fitting effect and the user experience.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and so forth) having computer-usable program code embodied therein. The scheme in the embodiments of the present application can be implemented in various computer languages, such as the object-oriented programming language Java and the scripting language JavaScript.
While the preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all alterations and modifications as fall within the scope of the application.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.

Claims (7)

1. A virtual fitting method based on 2D images is characterized by comprising the following steps:
inputting a 2D image, a human body semantic segmentation image and a clothing image of a try-on person, dividing the human body semantic segmentation image into a try-on area and a non-try-on area through preprocessing operation, and performing masking and correction processing on the clothing image to generate a clothing mask image and a clothing correction image;
step (2), inputting a clothing semantic graph and a non-try-on area semantic graph, and generating a human body semantic segmentation graph after trial wearing through an end-to-end semantic splicing network under the constraint of introducing an integrated discriminator and a multi-level loss function;
step (3), predicting a deformed clothing image under the constraints of introducing a frequency domain classifier and a regular constraint loss function through a clothing deformation network according to the tried-on semantic segmentation graph, the human body posture graph and the clothing image;
step (4), according to the image of the try-on person, coloring the semantic graph of the non-try-on area to obtain a colored graph of the non-try-on area, and then splicing the colored graph with the deformed clothing image to obtain a rough try-on result graph;
and (5) inputting the rough fitting result image and the semantic segmentation image after fitting into a normalized network, calculating and reducing an artifact area, and finally obtaining a fine fitting result image.
2. The virtual fitting method based on 2D images as claimed in claim 1, wherein: after the 2D image, the human body semantic segmentation map and the clothing image of the try-on person are input in the step (1), the preprocessing process comprises two parts:
in one part, a human body semantic segmentation graph is input, which contains 7 kinds of semantic information: that of the hair, face, neck, hands, lower garment, clothing, and arms, each labeled with a different pixel value, which may be expressed as Y = {a_1, a_2, a_3, …, a_7}; during preprocessing, according to the different pixel values of the semantic information, the human body semantic segmentation graph is divided into a non-try-on region semantic graph Y_1 = {a_1, a_2, a_3, a_4, a_5} and a try-on region semantic graph Y_2 = {a_6, a_7}, wherein the non-try-on region semantic graph contains the semantic information of the hair, face, neck, hands and lower garment, and the try-on region semantic graph contains the semantic information of the clothing and arms;
and inputting a clothing image, performing mask processing on the clothing image, namely setting the pixel value of a clothing region in the clothing image to be 0 and the pixel value of the background to be 255 to obtain a clothing mask image, and performing affine transformation on the clothing image according to the position information of the pixel points in the semantic graph of the try-on region to enable the outline of the clothing image to be consistent with the outline of the semantic graph of the try-on region, thereby obtaining a clothing correction graph and preparing for subsequent clothing deformation.
3. The virtual fitting method based on 2D images as claimed in claim 1, wherein: the semantic splicing network in the step (2) is an end-to-end network, a clothing semantic graph and a non-try-on area semantic graph are input, image features are extracted through a convolution layer and are down-sampled, a compressed image is up-sampled into an original size through a reverse convolution layer, a rough try-on human body semantic segmentation graph is obtained, the rough try-on human body semantic segmentation graph is input to an integrated discriminator to be subjected to similarity calculation, and a try-on human body semantic segmentation graph with the highest similarity is output after multiple rounds of semantic splicing; the integrated discriminator is formed by splicing n depth convolution networks with the same network structure, the size of n is related to the resolution of an input picture, and the depth convolution network is formed by sequentially connecting a pooling layer, a convolution layer, a normalization layer and an activation layer; the input of the integrated discriminator is a real picture and a predicted picture, the output is the feature similarity of the two pictures, the n deep convolutional networks respectively distinguish the real picture and the predicted picture on n different scales, and the influence of the prediction result of each deep convolutional network on the final result is controlled by setting a coefficient, so that the semantic splicing network is encouraged to generate more details; each deep convolutional network corresponds to a characteristic matching loss function and forms a multi-level loss function of the integrated discriminator together, fine-grained optimization is carried out on the prediction result, and the semantic splicing network is encouraged to generate finer details;
the multi-level loss function is expressed as:
Figure DEST_PATH_IMAGE001
in the above formula, the first and second carbon atoms are,
Figure 118160DEST_PATH_IMAGE002
representing deep convolutional networksD 1Generated for semantic splicing networksG(1) The features of the image match the loss function,
Figure DEST_PATH_IMAGE003
is a coefficient that controls the effect of different scales of deep convolutional network loss on the multi-level loss function, i.e.
Figure 339057DEST_PATH_IMAGE004
4. The virtual fitting method based on 2D images as claimed in claim 1, wherein: the clothing deformation network in step (3) is an end-to-end network; the post-try-on semantic segmentation graph, the clothing image and the human body posture graph are input, and under the constraint of a plurality of key point positions in the human body posture graph, pixel information at the key point positions of the post-try-on semantic segmentation graph and the clothing image is extracted by a deep convolutional network through two branches; a frequency domain classifier is introduced, the pixel information of the deformed clothing image is predicted by an interpolation method, a regular constraint loss function is introduced to control the degree of clothing deformation, and the deformed clothing image is finally output;
the canonical constraint loss function is expressed as:
Figure DEST_PATH_IMAGE005
in the above-mentioned formula, the compound has the following structure,
Figure 645010DEST_PATH_IMAGE006
representing coordinate points before and after deformation respectivelyxThe value of the one or more of the one,
Figure DEST_PATH_IMAGE007
representing coordinate points before and after deformation respectivelyyThe value of the one or more of the one,
Figure 584147DEST_PATH_IMAGE008
to representxThe square of the amount of change in the coordinate in the axial direction,
Figure DEST_PATH_IMAGE009
to representyThe amount of change in the coordinates in the direction,α,βis two learnable parameters for controlling the clothesxShaft andydegree of deformation in the axial direction;
the loss function of the garment deformation network is expressed as:
Figure 455151DEST_PATH_IMAGE010
in the above formula, the first and second carbon atoms are,
Figure DEST_PATH_IMAGE011
is a conditional-confrontation-loss function,
Figure 61713DEST_PATH_IMAGE012
is a function of the cross-entropy loss of the pixel,
Figure DEST_PATH_IMAGE013
is a function of the loss of the regular constraint,λ 1λ 2λ 3are coefficients that are used to adjust the effect of the three loss functions on the overall loss function.
5. The virtual fitting method based on 2D images as claimed in claim 1, wherein: in step (3), the frequency domain classifier divides the features in frequency domain space by setting a threshold; on the one hand, more attention is placed on the generated low-frequency feature information to enhance the content information of the image; on the other hand, noise points are distinguished from high-frequency information with the aid of the post-try-on semantic segmentation graph so as to extract more useful high-frequency information, and interpolation is used to enhance the texture fineness of the high-frequency regions of the image, thereby improving the quality of the generated image; the specific process of the frequency domain classifier comprises the following steps:
first, the spatial domain feature map of the clothing image is extracted by the convolution layers, and the frequency domain feature map is obtained by discrete Fourier transform, which may be expressed as:

F_{i,j} = DFT(S_{i,j})

where S_{i,j} denotes the spatial feature component in row i, column j of the spatial domain feature map, F_{i,j} denotes the frequency feature component in row i, column j of the frequency domain feature map, and DFT(·) is the discrete Fourier transform function;
then, the importance weight of each frequency feature component in the frequency domain feature map is learned through a neural network structure; a frequency feature component whose weight is greater than the threshold is identified as a low-frequency information feature component, and one whose weight is less than the threshold is identified as a high-frequency information feature component, which may be expressed as:

F_{i,j}^z with z = 0 if k_{i,j} > λ, and z = 1 if k_{i,j} ≤ λ

where F_{i,j}^z denotes the frequency feature component in row i, column j whose category is z (z = 0 indicates a low-frequency information feature component and z = 1 a high-frequency information feature component), k_{i,j} denotes the importance weight of the frequency feature component in row i, column j, and λ is the threshold;
and finally, the noise information in the high-frequency information feature channel is further screened using the position information in the post-try-on semantic segmentation graph, image noise is filtered out with a low-pass filter, and the image content and texture information are filled in by an interpolation method.
6. The virtual fitting method based on 2D images as claimed in claim 1, wherein: the processing procedure for removing the artifacts in the step (5) comprises the following steps: firstly, extracting image features of a rough fitting result graph through a deep convolutional network, obtaining a semantic segmentation graph of the rough fitting result graph through up-sampling, and solving a difference value with the semantic segmentation graph after fitting, wherein the process can be expressed as:
Artifact=R-S
wherein,Artifactpixel information representing the area of the artifact,Ra graph showing the results of a rough fitting,Srepresenting a semantic segmentation graph after fitting;
obtaining an image in which only a small fraction of pixel values are greater than 0, wherein the pixel points whose values are greater than 0 are the pixel points of the artifact region, and locating the region in the rough try-on result graph according to the position information of the artifact region; then learning two deformation coefficients α and β through the normalization network, which respectively control the degree of shrinking of the artifact region along the x-axis and y-axis directions, to reduce the size of the artifact region; and finally upsampling the adjusted feature picture back to the original picture size through an upsampling layer and filling the lost pixel point values by an interpolation method to generate a finer try-on result picture.
7. The virtual fitting method based on 2D images as claimed in claim 1, wherein: in the step (5), the input of the normalization network is a rough fitting result graph, the output is a fine fitting result graph, and the normalization network comprises two stages of acquiring an artifact semantic graph and eliminating artifact information:
the first stage of acquiring an artifact semantic graph comprises two sub-networks which are set as networks 1-1 and 1-2, wherein the network 1-1 comprises n layers of convolution layers, image features are extracted through the convolution layers, n layers of deconvolution layers map images to be in original sizes through deconvolution, the output of each convolution layer is used as the input of each deconvolution layer, the convolution layers are connected with the deconvolution layers through residual error structures, and results are output after the last layer of deconvolution; the network 1-2 comprises an image difference calculating module, two images with the same size are input, and a result graph obtained by calculating the difference of the two images is output; wherein the value of n is related to the size of the input picture;
the second stage, eliminating artifact information, comprises a network 2-1 consisting of two parallel branches: the first branch is formed by a convolution layer, an activation layer, a region regularization layer, a down-sampling layer, an up-sampling layer, and an addition layer connected in series, the second branch is formed by a single region regularization layer, and the final result is output after the addition layer, as sketched below;
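A hedged PyTorch sketch of this two-branch structure follows; the channel count, the bilinear resampling, and the simplified RegionNorm (the region regularization layer itself is described in the next paragraph) are assumptions for illustration, not the patent's exact layers:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RegionNorm(nn.Module):
    """Simplified region regularization: mean-center features inside the
    artifact region given by a (B, 1, H, W) binary mask."""
    def forward(self, x: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        area = mask.sum(dim=(2, 3), keepdim=True).clamp(min=1.0)
        mu = (x * mask).sum(dim=(2, 3), keepdim=True) / area
        return x * (1 - mask) + (x - mu) * mask

class Network21(nn.Module):
    """Two parallel branches fused by an addition layer, as in the claim."""
    def __init__(self, ch: int = 3):
        super().__init__()
        self.conv = nn.Conv2d(ch, ch, kernel_size=3, padding=1)
        self.act = nn.ReLU(inplace=True)
        self.rn1 = RegionNorm()   # region regularization in branch 1
        self.rn2 = RegionNorm()   # branch 2 is a single region regularization layer

    def forward(self, x: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        # Branch 1: conv -> activation -> region norm -> down-sample -> up-sample
        b1 = self.rn1(self.act(self.conv(x)), mask)
        b1 = F.interpolate(b1, scale_factor=0.5, mode="bilinear", align_corners=False)
        b1 = F.interpolate(b1, size=x.shape[2:], mode="bilinear", align_corners=False)
        # Addition layer fuses branch 1 with branch 2
        return b1 + self.rn2(x, mask)
```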
the region regularization layer is used to standardize the artifact region: it first obtains the spatial position of the artifact region, then computes the deformation coefficients α and β through a convolution network, and introduces a plane deformation loss function to control the degree of reduction of the artifact region, preventing pixel information of the human body from being wrongly modified by excessive reduction and thereby assisting in shrinking the artifact region; the addition layer is used to fuse the output results of the two branches;
wherein the plane deformation loss function can be expressed as:

L_def = Σ d((x, y), (x + α·i, y + β·j))

in the above formula, the sum runs over the feature points of the artifact region, x and y denote the coordinates of a feature point in the artifact region, d(·, ·) denotes the Euclidean distance between two nodes, i and j are the deformation amounts, and α and β are the deformation coefficients.
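Under that reconstruction the loss reduces to the summed magnitude of the deformation shifts; the sketch below mirrors the formula literally (the function name and the (N, 2) tensor layout are assumptions, not taken from the patent):

```python
import torch

def plane_deformation_loss(xy: torch.Tensor, ij: torch.Tensor,
                           alpha: torch.Tensor, beta: torch.Tensor) -> torch.Tensor:
    """xy: (N, 2) coordinates of artifact-region feature points;
    ij: (N, 2) deformation amounts; alpha, beta: deformation coefficients."""
    shift = torch.stack((alpha * ij[:, 0], beta * ij[:, 1]), dim=1)
    deformed = xy + shift
    # Euclidean distance between each feature point and its deformed position
    return torch.norm(deformed - xy, dim=1).sum()
```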
CN202210573730.5A 2022-05-25 2022-05-25 Virtual fitting method based on 2D image Active CN114663552B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210573730.5A CN114663552B (en) 2022-05-25 2022-05-25 Virtual fitting method based on 2D image

Publications (2)

Publication Number Publication Date
CN114663552A (en) 2022-06-24
CN114663552B (en) 2022-08-16

Family

ID=82038409

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210573730.5A Active CN114663552B (en) 2022-05-25 2022-05-25 Virtual fitting method based on 2D image

Country Status (1)

Country Link
CN (1) CN114663552B (en)

Patent Citations (36)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190333267A1 (en) * 2008-08-15 2019-10-31 Brown University Method and apparatus for estimating body shape
CN103597519A (en) * 2011-02-17 2014-02-19 麦特尔有限公司 Computer implemented methods and systems for generating virtual body models for garment fit visualization
CN104813340A (en) * 2012-09-05 2015-07-29 体通有限公司 System and method for deriving accurate body size measures from a sequence of 2d images
KR20140125507A (en) * 2013-04-19 2014-10-29 정상학 Virtual fitting apparatus and method using digital surrogate
CN103578004A (en) * 2013-11-15 2014-02-12 西安工程大学 Method for displaying virtual fitting effect
JP2016038812A (en) * 2014-08-08 2016-03-22 株式会社東芝 Virtual try-on apparatus, virtual try-on method and a program
CN104156966A (en) * 2014-08-11 2014-11-19 石家庄铁道大学 Pseudo 3D real-time virtual fitting method based on mobile terminal
WO2016109884A1 (en) * 2015-01-05 2016-07-14 Valorbec Limited Partnership Automated recommendation and virtualization systems and methods for e-commerce
EP3091510A1 (en) * 2015-05-06 2016-11-09 Reactive Reality GmbH Method and system for producing output images and method for generating image-related databases
GB201510752D0 (en) * 2015-06-18 2015-08-05 Morris Gary Personalized garment image processing
US20170024928A1 (en) * 2015-07-22 2017-01-26 Adidas Ag Computer-implemented method and apparatus for generating an image of a person wearing a selectable article of apparel
US20170084035A1 (en) * 2015-09-18 2017-03-23 Xiaofeng Han Systems and methods for evaluating suitability of an article for an individual
CN105354876A (en) * 2015-10-20 2016-02-24 何家颖 Mobile terminal based real-time 3D fitting method
US20200015575A1 (en) * 2017-07-13 2020-01-16 Shiseido Americas Corporation Systems and Methods for Virtual Facial Makeup Removal and Simulation, Fast Facial Detection and Landmark Tracking, Reduction in Input Video Lag and Shaking, and a Method for Recommending Makeup
WO2019193467A1 (en) * 2018-04-05 2019-10-10 Page International Fz Llc Method and device for the virtual try-on of garments based on augmented reality with multi-detection
US20200151807A1 (en) * 2018-11-14 2020-05-14 Beijing Jingdong Shangke Information Technology Co., Ltd. System and method for automatically generating three-dimensional virtual garment model using product description
JP2020097803A (en) * 2018-12-18 2020-06-25 成衛 貝田 Virtual fitting system
WO2020131518A1 (en) * 2018-12-19 2020-06-25 Seddi, Inc. Learning-based animation of clothing for virtual try-on
CN109740529A (en) * 2018-12-29 2019-05-10 广州二元科技有限公司 A kind of virtual fit method drawn based on neural network
CN111787242A (en) * 2019-07-17 2020-10-16 北京京东尚科信息技术有限公司 Method and apparatus for virtual fitting
WO2021008166A1 (en) * 2019-07-17 2021-01-21 北京京东尚科信息技术有限公司 Method and apparatus for virtual fitting
US20210049811A1 (en) * 2019-08-13 2021-02-18 Texel Llc Method and System for Remote Clothing Selection
US20210133919A1 (en) * 2019-11-04 2021-05-06 Adobe Inc. Cloth Warping Using Multi-Scale Patch Adversarial Loss
CN110852941A (en) * 2019-11-05 2020-02-28 中山大学 Two-dimensional virtual fitting method based on neural network
KR20210099353A (en) * 2020-02-04 2021-08-12 엔에이치엔 주식회사 Clothing virtual try-on service method on deep-learning and apparatus thereof
CN111768472A (en) * 2020-05-29 2020-10-13 北京沃东天骏信息技术有限公司 Virtual fitting method and device and computer-readable storage medium
CN111709874A (en) * 2020-06-16 2020-09-25 北京百度网讯科技有限公司 Image adjusting method and device, electronic equipment and storage medium
WO2022002961A1 (en) * 2020-06-29 2022-01-06 L'oréal Systems and methods for improved facial attribute classification and use thereof
CN112233222A (en) * 2020-09-29 2021-01-15 深圳市易尚展示股份有限公司 Human body parametric three-dimensional model deformation method based on neural network joint point estimation
CN112232914A (en) * 2020-10-19 2021-01-15 武汉纺织大学 Four-stage virtual fitting method and device based on 2D image
CN112330580A (en) * 2020-10-30 2021-02-05 北京百度网讯科技有限公司 Method, device, computing equipment and medium for generating human body clothes fusion image
CN112613439A (en) * 2020-12-28 2021-04-06 湖南大学 Novel virtual fitting network
CN113012303A (en) * 2021-03-10 2021-06-22 浙江大学 Multi-variable-scale virtual fitting method capable of keeping clothes texture characteristics
CN113052980A (en) * 2021-04-27 2021-06-29 云南大学 Virtual fitting method and system
CN113781164A (en) * 2021-08-31 2021-12-10 深圳市富高康电子有限公司 Virtual fitting model training method, virtual fitting method and related device
CN114419335A (en) * 2022-01-06 2022-04-29 百果园技术(新加坡)有限公司 Training and texture migration method of texture recognition model and related device

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115937964A (en) * 2022-06-27 2023-04-07 北京字跳网络技术有限公司 Method, device, equipment and storage medium for attitude estimation
CN115937964B (en) * 2022-06-27 2023-12-15 北京字跳网络技术有限公司 Method, device, equipment and storage medium for estimating gesture
CN115496990A (en) * 2022-11-18 2022-12-20 武汉纺织大学 Deep learning-based garment image analysis method
CN115761143A (en) * 2022-12-07 2023-03-07 武汉纺织大学 3D virtual reloading model generation method and device based on 2D image
CN115761143B (en) * 2022-12-07 2023-09-19 武汉纺织大学 3D virtual reloading model generation method and device based on 2D image

Also Published As

Publication number Publication date
CN114663552B (en) 2022-08-16

Similar Documents

Publication Publication Date Title
CN114663552B (en) Virtual fitting method based on 2D image
Li et al. Low-light image enhancement via progressive-recursive network
CN112634137B (en) Hyperspectral and panchromatic image fusion method for extracting multiscale spatial spectrum features based on AE
CN110097609B (en) Sample domain-based refined embroidery texture migration method
CN108830818B (en) Rapid multi-focus image fusion method
CN108932536A (en) Human face posture method for reconstructing based on deep neural network
CN110853119A (en) Robust reference picture-based makeup migration method
CN113222875B (en) Image harmonious synthesis method based on color constancy
WO2023066173A1 (en) Image processing method and apparatus, and storage medium and electronic device
CN112232914A (en) Four-stage virtual fitting method and device based on 2D image
CN113160033A (en) Garment style migration system and method
Gao et al. Improving the performance of infrared and visible image fusion based on latent low-rank representation nested with rolling guided image filtering
CN113077545B (en) Method for reconstructing clothing human body model from image based on graph convolution
CN116402691B (en) Image super-resolution method and system based on rapid image feature stitching
CN113192055A (en) Harmonious method and model for synthesizing image
CN115272632B (en) Virtual fitting method based on gesture migration
Zhou et al. Linear contrast enhancement network for low-illumination image enhancement
CN116168186A (en) Virtual fitting chart generation method with controllable garment length
CN117593178A (en) Virtual fitting method based on feature guidance
CN113920014A (en) Neural-networking-based combined trilateral filter depth map super-resolution reconstruction method
Kim et al. Detail restoration and tone mapping networks for x-ray security inspection
CN116362972B (en) Image processing method, device, electronic equipment and storage medium
CN116645451A (en) High-precision garment texture virtual fitting method and system
Wu et al. SyFormer: Structure-Guided Synergism Transformer for Large-Portion Image Inpainting
CN114331894A (en) Face image restoration method based on potential feature reconstruction and mask perception

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant