CN115272632B - Virtual fitting method based on gesture migration - Google Patents
Virtual fitting method based on gesture migration
- Publication number
- CN115272632B (application CN202210795212.8A)
- Authority
- CN
- China
- Prior art keywords
- image
- clothing
- analysis
- network
- target
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
- 238000000034 method Methods 0.000 title claims abstract description 48
- 230000005012 migration Effects 0.000 title claims abstract description 8
- 238000013508 migration Methods 0.000 title claims abstract description 8
- 238000010586 diagram Methods 0.000 claims abstract description 7
- 230000009466 transformation Effects 0.000 claims abstract description 4
- 230000006870 function Effects 0.000 claims description 37
- 230000008569 process Effects 0.000 claims description 15
- 238000005070 sampling Methods 0.000 claims description 15
- 238000010606 normalization Methods 0.000 claims description 14
- 230000004927 fusion Effects 0.000 claims description 9
- 230000008439 repair process Effects 0.000 claims description 9
- 238000012545 processing Methods 0.000 claims description 3
- 230000007704 transition Effects 0.000 claims description 3
- 230000002708 enhancing effect Effects 0.000 claims description 2
- 230000001537 neural effect Effects 0.000 claims description 2
- 238000007781 pre-processing Methods 0.000 claims 1
- 230000000694 effects Effects 0.000 abstract description 14
- 239000004744 fabric Substances 0.000 abstract description 5
- 238000003062 neural network model Methods 0.000 abstract 1
- 230000011218 segmentation Effects 0.000 description 5
- 210000003423 ankle Anatomy 0.000 description 4
- 238000005516 engineering process Methods 0.000 description 4
- 210000003127 knee Anatomy 0.000 description 4
- 210000000707 wrist Anatomy 0.000 description 4
- 238000012546 transfer Methods 0.000 description 3
- 230000015572 biosynthetic process Effects 0.000 description 2
- 230000008859 change Effects 0.000 description 2
- 238000000354 decomposition reaction Methods 0.000 description 2
- 238000000605 extraction Methods 0.000 description 2
- 238000003786 synthesis reaction Methods 0.000 description 2
- 238000012549 training Methods 0.000 description 2
- 230000009286 beneficial effect Effects 0.000 description 1
- 238000013527 convolutional neural network Methods 0.000 description 1
- 230000014759 maintenance of location Effects 0.000 description 1
- 238000004321 preservation Methods 0.000 description 1
- 230000000717 retained effect Effects 0.000 description 1
- 238000005728 strengthening Methods 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T19/00—Manipulating 3D models or images for computer graphics
- G06T19/20—Editing of 3D images, e.g. changing shapes or colours, aligning objects or positioning parts
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/06—Buying, selling or leasing transactions
- G06Q30/0601—Electronic shopping [e-shopping]
- G06Q30/0641—Shopping interfaces
- G06Q30/0643—Graphical representation of items or shoppers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2200/00—Indexing scheme for image data processing or generation, in general
- G06T2200/04—Indexing scheme for image data processing or generation, in general involving 3D image data
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
- Y02P90/00—Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
- Y02P90/30—Computing systems specially adapted for manufacturing
Landscapes
- Engineering & Computer Science (AREA)
- Business, Economics & Management (AREA)
- Physics & Mathematics (AREA)
- Accounting & Taxation (AREA)
- Finance (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- General Business, Economics & Management (AREA)
- Strategic Management (AREA)
- Marketing (AREA)
- Economics (AREA)
- Development Economics (AREA)
- Architecture (AREA)
- Computer Graphics (AREA)
- Computer Hardware Design (AREA)
- General Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Image Processing (AREA)
- Image Analysis (AREA)
Abstract
The present invention relates to a virtual fitting method based on pose transfer, comprising: obtaining an original parsing map and a wearer image; first extracting the clothing pixel information from the wearer image and then performing texture restoration to obtain a refined clothing image; inputting the original parsing map and the target pose into a parsing guidance network to obtain a parsing guidance map; preliminarily limiting the warping range of the clothing according to the parsing guidance map; obtaining the target pose and preprocessing it to obtain a parsing map with the lower body removed, and obtaining the warped clothing image through a clothing warping network; and generating the try-on result in the target pose from the parsing guidance map, the warped clothing image, the target pose and the wearer image. By feeding the parsing guidance map, the warped target clothing image, the target pose and the wearer image simultaneously into a neural network model, the invention obtains a fitting effect image in the target pose, improves the fitting effect, and solves the problem of skin and cloth pixels becoming confused when the try-on pose changes.
Description
Technical Field
The invention belongs to the field of clothing image processing, and in particular relates to a virtual fitting method based on pose transfer.
Background Art
In recent years, as shopping has shifted from offline to online, online clothing shopping has been favored by consumers. However, garments bought online cannot be tried on, so consumers cannot see how the clothing looks on themselves. Virtual fitting allows sellers to present the advantages of their garments more objectively, lets both parties to a transaction grasp the relevant information more intuitively, facilitates transactions, reduces unnecessary work, improves efficiency, and meets user needs.
Existing techniques that combine virtual fitting with pose transfer to achieve multi-pose virtual try-on fall mainly into two categories: methods based on 2D images and methods based on 3D reconstruction. Multi-pose fitting techniques that work directly on 2D images are still few, and their results suffer from confusion between skin and cloth pixels and from loss of detail. Methods based on 3D reconstruction give better results, but they place relatively high demands on device computing power, performance and the quality of the generated models, which hinders the promotion and popularization of the technology.
Chinese patent publication CN 108734787A discloses "a picture-synthesis virtual fitting method based on multi-pose and part decomposition", which synthesizes the result by decomposing poses and body parts instead of simply compositing the whole clothing picture, and therefore achieves a more realistic virtual fitting effect. However, that technique does not address the confusion between skin and cloth pixels or the loss of detail caused by pose changes, which greatly degrades the fitting effect.
Summary of the Invention
The object of the present invention is to address the above problems by providing a virtual fitting method based on pose transfer. A parsing guidance map is used to limit the warping range of the target clothing image, preventing it from being excessively distorted as it follows the target pose. From the target pose and a parsing map with the lower body removed, a clothing warping network produces the target clothing image warped to the target pose. The wearer image, the warped target clothing image, the target pose and the parsing guidance map are then fed simultaneously into a try-on image generation network, which produces the try-on result in the target pose. This improves the fitting effect, avoids the confusion of skin and cloth pixels caused by changing the try-on pose, and preserves more clothing texture detail.
The technical solution of the present invention is a virtual fitting method based on pose transfer, comprising the following steps:
Step 1: obtain the original parsing map and the wearer image; first extract the clothing pixel information from the wearer image to obtain a rough clothing image, then perform texture restoration to obtain a refined clothing image.
Step 2: input the original parsing map, the target clothing and the target pose into the parsing guidance network to obtain the parsing guidance map.
Step 3: preliminarily limit the warping range of the target clothing according to the parsing guidance map.
Step 4: obtain the target pose and preprocess it to obtain a parsing map with the lower body removed, then obtain the warped target clothing image through the clothing warping network.
Step 5: generate the try-on result in the target pose through the image generation network from the parsing guidance map, the warped target clothing image, the target pose and the wearer image.
Further, step 1 performs pixel-level restoration on the clothing image. The restoration process is as follows:
Convolutional neural layers first learn the edge information of the clothing image, focusing on regions where pixel values change sharply; interpolation is then used to repair the pixels in those regions, ensuring that the clothing edges are smooth and transition naturally into the background.
Preferably, step 1 includes the following sub-steps:
First, according to the clothing semantic information in the original parsing map, the pixel information of the corresponding region of the wearer image is extracted to obtain a preliminary clothing image; this image may have blurred edges or gaps along the edges.
Then, interpolation is used to fill the blurred and missing regions of the clothing image, yielding a more refined clothing image, as sketched below.
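A minimal sketch of this extract-and-repair step is given below. It assumes the parsing map is a label image in which the upper-garment region carries a known label value; that label constant and the use of OpenCV inpainting as the interpolation step are illustrative assumptions, not details fixed by the patent.

```python
import cv2
import numpy as np

UPPER_CLOTHES_LABEL = 5  # assumed label value of the upper-garment region in the parsing map

def extract_and_repair_clothing(wearer_img, parsing_map):
    """Extract clothing pixels via the parsing map, then repair blurred/missing edge regions."""
    # 1. Mask out everything except the upper-garment region.
    mask = (parsing_map == UPPER_CLOTHES_LABEL).astype(np.uint8)
    rough_clothing = wearer_img * mask[..., None]

    # 2. Locate edge regions where pixel values change sharply (candidate blur/gap areas).
    edges = cv2.Canny(cv2.cvtColor(rough_clothing, cv2.COLOR_BGR2GRAY), 50, 150)
    repair_mask = cv2.dilate(edges, np.ones((3, 3), np.uint8), iterations=1)

    # 3. Interpolate pixels inside the repair mask so the clothing edges are smooth
    #    and transition naturally into the background.
    refined_clothing = cv2.inpaint(rough_clothing, repair_mask, 3, cv2.INPAINT_TELEA)
    return refined_clothing
```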
Further, the specific process of step 2 is as follows:
First, the original parsing map and the target pose are input into the parsing guidance network, whose multi-layer convolutional network extracts image features. A residual module and a wavelet sampling layer are added to the parsing guidance network to extract higher-level semantic structure, so that the network learns the detailed relationships between the parts of the human body; the wavelet sampling layer converts feature maps into the frequency domain via the wavelet transform for downsampling, which better preserves texture information.
Then, the extracted image features are fed into the network's multi-layer deconvolutional network to upsample the image, and a normalization layer is inserted between the deconvolutions to strengthen the fusion of global and local features; a normalization constraint loss function is introduced so that more semantic detail is preserved during upsampling.
Finally, the spatial positions in the generated parsing guidance map are compared with the target pose to ensure that each semantic part aligns with its corresponding pose keypoints, which better handles overlaps between the arms and the clothing; the semantic positions are then fine-tuned to obtain a more regular parsing guidance map.
Preferably, the normalization constraint loss function is defined as follows:
In the formula, G denotes the global features of the image before parsing, G′ the global features of the parsed image, L the local features of the image before parsing, and L′ the local features of the parsed image; the loss combines a global feature matching term between the images before and after parsing with a local feature matching term, and two learning coefficients adjust the relative importance of the global and local features.
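The formula itself appears as an image in the original and is not reproduced in this text. From the variable definitions above, a plausible form (a reconstruction, with λ₁ and λ₂ standing in for the unnamed learning coefficients) is:

$$\mathcal{L}_{\mathrm{norm}} = \lambda_{1}\,\mathcal{L}_{G}(G, G') + \lambda_{2}\,\mathcal{L}_{L}(L, L')$$

where $\mathcal{L}_{G}$ and $\mathcal{L}_{L}$ are the global and local feature matching losses between the image before and after parsing.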
The parsing guidance map contains semantic segmentation information, specifically: face, hair, neck, upper-garment region, left arm, right arm, left hand, right hand, left shoulder, right shoulder, and lower-garment region.
Preferably, the target pose contains 18 keypoints: nose, neck, right shoulder, right elbow, right wrist, left shoulder, left elbow, left wrist, right hip, right knee, right ankle, left hip, left knee, left ankle, right eye, left eye, right ear, and left ear.
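The patent lists the 18 keypoints but does not state how the target pose is fed to the networks; a common encoding (an assumption here, not part of the patent text) is one Gaussian heatmap channel per keypoint, as sketched below.

```python
import numpy as np

KEYPOINTS = ["nose", "neck", "r_shoulder", "r_elbow", "r_wrist",
             "l_shoulder", "l_elbow", "l_wrist", "r_hip", "r_knee",
             "r_ankle", "l_hip", "l_knee", "l_ankle", "r_eye",
             "l_eye", "r_ear", "l_ear"]  # order follows the list in the text

def pose_to_heatmaps(keypoints_xy, height, width, sigma=6.0):
    """Encode 18 (x, y) keypoints as an 18-channel Gaussian heatmap tensor."""
    maps = np.zeros((len(KEYPOINTS), height, width), dtype=np.float32)
    ys, xs = np.mgrid[0:height, 0:width]
    for k, (x, y) in enumerate(keypoints_xy):
        if x < 0 or y < 0:  # convention for a missing keypoint
            continue
        maps[k] = np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2 * sigma ** 2))
    return maps
```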
Further, the specific process of step 4 is as follows:
First, using the differences between the pixel values of the semantic labels in the parsing guidance map, the semantic information of the lower body is removed, yielding a parsing map with the lower body removed.
Then, the parsing map with the lower body removed and the target pose are used to constrain the overall contour within which the clothing image may be warped, preventing the warping network from forcibly deforming the clothing image and avoiding excessive distortion of the clothing.
Finally, the clothing image is deformed by the warping network under a plane deformation loss function, yielding the warped clothing image.
Preferably, the plane deformation loss function is defined as follows:
In the formula, C_x(x) and C_y(x) denote the x and y coordinates of the sampling parameters respectively, |C_x(x+i, y) - C_x(x, y)| denotes the Euclidean distance between two nodes, i and j are deformation variables, and γ and δ are deformation coefficients.
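The formula is again supplied as an image in the original. From the definitions above, a plausible reconstruction, written in the style of the usual smoothness constraint on neighbouring sampling-grid nodes (the exact grouping of terms is an assumption), is:

$$\mathcal{L}_{\mathrm{flat}} = \sum_{x,y}\Big(\gamma\,\big|C_{x}(x+i,\,y)-C_{x}(x,\,y)\big| + \delta\,\big|C_{y}(x,\,y+j)-C_{y}(x,\,y)\big|\Big)$$

which penalizes abrupt changes between the coordinates of neighbouring grid nodes and thus discourages excessive local warping.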
Further, in step 5, the try-on image generation network is an end-to-end network comprising a generator and a discriminator. The generator takes the parsing guidance map, the warped clothing image and the wearer image as input and, under the constraint of the parsing guidance map, generates a coarse try-on result from the pixel information of the warped clothing image and the wearer image. The coarse result is then passed through the discriminator, with a feature point matching loss function, to judge whether it conforms to the target pose and to extract more arm-region features, continuously refining the details of the coarse try-on result and improving image clarity.
Preferably, the feature point matching loss function is defined as follows:
In the formula, W denotes the human pose coordinate points in the coarse try-on result, M denotes the coordinate points of the target pose, W_i(x) denotes the abscissa of coordinate point i in the coarse try-on result, M_i(x) denotes the abscissa of coordinate point i in the target pose map, n denotes the total number of feature points, |W_i(x) - M_i(x)| denotes the distance along the x axis between keypoints of the same body part, and α and β are adjustment coefficients with α + β = 1.
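As with the other losses, only the variable definitions survive in this text. A plausible reconstruction, adding the y-coordinate term as the natural counterpart of the x-coordinate term defined above (the pairing with α and β is an assumption), is:

$$\mathcal{L}_{\mathrm{point}} = \frac{1}{n}\sum_{i=1}^{n}\Big(\alpha\,\big|W_{i}(x)-M_{i}(x)\big| + \beta\,\big|W_{i}(y)-M_{i}(y)\big|\Big),\qquad \alpha+\beta=1.$$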
Compared with the prior art, the beneficial effects of the present invention include:
(1) By feeding the parsing guidance map containing semantic segmentation information, the target clothing image warped to the target pose, the target pose and the wearer image simultaneously into the try-on image generation network, the invention obtains a fitting result image of the wearer in the target pose. This substantially improves the fitting effect, solves the confusion between skin and cloth pixels caused by changing the try-on pose, preserves more clothing texture detail in the fitting result, and improves the virtual try-on experience.
(2) The invention uses the parsing guidance map containing semantic segmentation information to limit the warping range of the target clothing image, preventing excessive distortion as the clothing follows the target pose and making the virtual try-on result more realistic.
(3) The invention obtains the clothing image from the wearer image and refines the texture of its blurred and missing regions to obtain a finer clothing image. This alleviates the lack of clothing images in the training dataset, strengthens the training of the try-on image generation network, the parsing guidance network and the clothing warping network, and enhances the robustness of the fitting method.
(4) In the parsing guidance process that produces the parsing guidance map, the invention introduces a normalization layer and a normalization constraint loss function, which strengthen the fusion of global and local features while preserving more semantic detail during upsampling.
(5) The invention introduces a feature point matching loss function into the try-on image generation network to judge whether the preliminary try-on result conforms to the target pose, which effectively avoids cross-occlusion between arms and clothing and further improves the virtual fitting effect.
Brief Description of the Drawings
The present invention is further described below with reference to the drawings and embodiments.
Fig. 1 is a schematic flowchart of the virtual fitting method according to an embodiment of the present invention.
Fig. 2 is a structural diagram of the parsing guidance network of the virtual fitting method according to an embodiment of the present invention.
Fig. 3 is a structural diagram of the clothing warping network of the virtual fitting method according to an embodiment of the present invention.
Fig. 4 is a structural diagram of the try-on image generation network of the virtual fitting method according to an embodiment of the present invention.
Fig. 5 is a schematic diagram of the virtual fitting system according to an embodiment of the present invention.
Detailed Description of the Embodiments
Embodiment 1
As shown in Fig. 1, the virtual fitting method based on pose transfer comprises the following steps:
(1) Obtain the original parsing map and the wearer image; first extract the clothing pixel information from the wearer image to obtain a rough clothing image, then perform texture restoration to obtain a refined clothing image.
The clothing image is obtained as follows. First, according to the clothing semantic information in the original parsing map, the pixel information of the corresponding region of the wearer image is extracted to obtain a rough clothing image whose edges are blurred and contain gaps. Then the texture of the clothing image is restored: interpolation is used to fill the blurred and missing regions of the rough clothing image, yielding a more refined clothing image.
The original parsing map contains the semantic information of each part of the wearer, including: face, hair, neck, upper-garment region, left arm, right arm, left hand, right hand, left shoulder, right shoulder, and lower-garment region.
In the texture restoration, a convolutional neural network first learns the edge information of the clothing image, focusing on regions where pixel values change sharply; interpolation is then used to repair the pixels in those regions, ensuring that the clothing edges are smooth and transition naturally into the background.
(2) Obtain the original parsing map and the target pose, input them into the parsing guidance network, and obtain the parsing guidance map.
The parsing guidance map shows the semantic segmentation of the wearer after the pose change, including the face, hair, neck, top, arms and lower garment.
The target pose consists of 18 keypoints: nose, neck, right shoulder, right elbow, right wrist, left shoulder, left elbow, left wrist, right hip, right knee, right ankle, left hip, left knee, left ankle, right eye, left eye, right ear, and left ear.
As shown in Fig. 2, the parsing guidance network consists of a multi-layer convolutional network and a multi-layer deconvolutional network; its inputs are the original parsing map, the target clothing and the target pose, and its output is the parsing guidance map.
The parsing guidance process is as follows. First, the original parsing map and the target pose are input, and image features are extracted by the multi-layer convolutional network; a residual module and a wavelet sampling layer are added to the parsing guidance network to extract higher-level semantic structure, so that the network learns the detailed relationships between the parts of the human body, the wavelet sampling layer converting feature maps into the frequency domain via the wavelet transform for downsampling, which better preserves texture information. Then, the extracted image features are fed into the multi-layer deconvolutional network to upsample the image, and a normalization layer is inserted between the deconvolutions to strengthen the fusion of global and local features; a normalization constraint loss function is introduced so that more semantic detail is preserved during upsampling. Finally, the spatial positions in the generated parsing guidance map are compared with the target pose to ensure that each semantic part aligns with its corresponding pose keypoints, which better handles overlaps between the arms and the clothing; the semantic positions are fine-tuned to obtain a more regular parsing guidance map.
In the normalization layer, the features produced by the preceding deconvolution are treated as local features and the features produced by the following deconvolution as global features, and the normalization constraint loss function controls the influence of the current local and global features on the subsequent fusion result.
The normalization constraint loss function is expressed as:
In the formula, G denotes the global features of the image before parsing, G′ the global features of the parsed image, L the local features of the image before parsing, and L′ the local features of the parsed image; the loss combines a global feature matching term between the images before and after parsing with a local feature matching term, and two learning coefficients adjust the relative importance of the global and local features.
(3) According to the parsing guidance map, preliminarily limit the warping range of the clothing.
(4) Obtain the target pose and preprocess it to obtain a parsing map with the lower body removed, then obtain the warped clothing image through the clothing warping network, as shown in Fig. 3.
The warped clothing image is obtained as follows. First, using the differences between the pixel values of the semantic labels in the parsing guidance map, the semantic information of the lower body is removed, yielding a parsing map with the lower body removed. Then, the parsing map with the lower body removed and the target pose are used to constrain the overall contour within which the clothing image may be warped, preventing the warping network from forcibly deforming the clothing image and avoiding excessive distortion of the clothing. Finally, the clothing image is deformed by the warping network under the plane deformation loss function, yielding the warped clothing image.
The plane deformation loss function is expressed as:
In the formula, C_x(x) and C_y(x) denote the x and y coordinates of the sampling parameters respectively, |C_x(x+i, y) - C_x(x, y)| denotes the Euclidean distance between two nodes, i and j are deformation variables, and γ and δ are deformation coefficients.
(5) Generate the try-on result in the target pose through the image generation network from the parsing guidance map, the warped clothing image, the target pose and the wearer image.
As shown in Fig. 4, the image generation network is an end-to-end network composed of a generator and a discriminator, the generator itself consisting of an encoder and a decoder. The generator takes the parsing guidance map, the warped clothing image and the wearer image as input and, under the constraint of the parsing guidance map, generates a coarse try-on result from the pixel information of the warped clothing image and the wearer image; the human pose map of the coarse try-on result is obtained, and the result is then passed through the discriminator, with the feature point matching loss function, to judge whether it conforms to the target pose and to encourage the extraction of more arm-region features, continuously refining the details of the coarse try-on result and improving image clarity.
The feature point matching loss function is as follows:
In the formula, W denotes the human pose coordinate points in the coarse try-on result, M denotes the coordinate points of the target pose, W_i(x) denotes the abscissa of coordinate point i in the coarse try-on result, M_i(x) denotes the abscissa of coordinate point i in the target pose map, n denotes the total number of feature points, |W_i(x) - M_i(x)| denotes the distance along the x axis between keypoints of the same body part, and α and β are adjustment coefficients with α + β = 1.
Embodiment 2
As shown in Fig. 5, the virtual fitting system for pose transfer comprises a parsing guidance module, a clothing matching module and an image fusion module.
The parsing guidance module performs pixel extraction and texture restoration from the original parsing map, the wearer image and the target pose, and then generates the parsing guidance map through the parsing guidance network.
The clothing matching module obtains the warped clothing image through the clothing warping network from the parsing guidance map, the target pose and the parsing map with the lower body removed.
The image fusion module generates the try-on result in the target pose through the try-on image generation network from the parsing guidance map, the warped clothing image, the target pose and the wearer image.
As shown in Fig. 2, the inputs of the semantic parsing network are the original parsing map and the target pose map, and its output is the parsing guidance map, i.e. the parsing map after pose transfer. The original parsing map and the target pose map are each processed by five sequentially connected residual blocks; each residual block extracts features with 3×3 convolutions, and the residual blocks are linked by wavelet layers that downsample the feature maps in the frequency domain. The final residual block is followed by a normalization layer, which strengthens the fusion of global and local features, and a normalization constraint loss function is introduced so that more semantic detail is preserved during upsampling. After normalization, the features pass through five sequentially connected deconvolution layers; adjacent deconvolution layers are linked by inverse wavelet layers used for upsampling, and the final deconvolution layer outputs the parsing guidance map.
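A compact PyTorch-style sketch of the structure just described follows. Channel widths, the Haar wavelet, the instance-normalization choice, the transposed-convolution stand-in for the inverse wavelet layers, and folding the two input branches into one concatenated encoder are illustrative assumptions; the patent specifies none of them.

```python
import torch
import torch.nn as nn

class HaarDown(nn.Module):
    """Wavelet sampling layer: a 2x2 Haar transform used as frequency-domain downsampling
    (the four sub-bands are stacked along the channel axis)."""
    def forward(self, x):
        a = x[:, :, 0::2, 0::2]; b = x[:, :, 0::2, 1::2]
        c = x[:, :, 1::2, 0::2]; d = x[:, :, 1::2, 1::2]
        return torch.cat([a + b + c + d, a - b + c - d, a + b - c - d, a - b - c + d], dim=1) / 2

class ResBlock(nn.Module):
    """Residual block built from 3x3 convolutions."""
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
                                  nn.Conv2d(ch, ch, 3, padding=1))
    def forward(self, x):
        return torch.relu(x + self.body(x))

class ParsingGuidanceNet(nn.Module):
    """Encoder of residual blocks + wavelet downsampling, a normalization layer, then five
    deconvolution layers for upsampling. The patent describes parallel branches for the parsing
    map and the pose map; they are concatenated into a single branch here for brevity."""
    def __init__(self, in_ch, out_ch, base=32):
        super().__init__()
        self.stem = nn.Conv2d(in_ch, base, 3, padding=1)
        enc, ch = [], base
        for _ in range(5):                                  # 5 residual blocks linked by wavelet layers
            enc += [ResBlock(ch), HaarDown(), nn.Conv2d(ch * 4, ch * 2, 1)]
            ch *= 2
        self.encoder = nn.Sequential(*enc)
        self.norm = nn.InstanceNorm2d(ch)                   # normalization layer before the decoder
        dec = []
        for _ in range(5):                                  # 5 deconvolution (upsampling) layers
            dec += [nn.ConvTranspose2d(ch, ch // 2, 4, stride=2, padding=1), nn.ReLU(inplace=True)]
            ch //= 2
        self.decoder = nn.Sequential(*dec)
        self.head = nn.Conv2d(ch, out_ch, 3, padding=1)     # per-pixel semantic logits

    def forward(self, parsing_map, target_pose):
        x = torch.cat([parsing_map, target_pose], dim=1)
        return self.head(self.decoder(self.norm(self.encoder(self.stem(x)))))
```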
As shown in Fig. 3, the inputs of the clothing warping network are the parsing guidance map and the clothing image, and its output is the warped clothing image. First, the parsing guidance map and the clothing image are each encoded by an encoder to extract their image features; then the deformation coefficients θ are computed from the two sets of features, while the parsing guidance map and the target pose constrain the overall contour within which the clothing image may be warped, preventing the warping network from forcibly deforming the clothing image and avoiding excessive distortion; finally, the clothing image is deformed by the warping operation under the plane deformation loss function, yielding the warped clothing image.
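A sketch of this warping stage is given below, assuming a TPS-like control-grid parameterisation; the patent only speaks of deformation coefficients θ, so the grid size, encoder widths and the bilinear approximation of the warp are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_encoder(in_ch, base=32):
    """Small strided-convolution encoder that pools to a feature vector."""
    layers, ch = [], in_ch
    for out in (base, base * 2, base * 4, base * 8):
        layers += [nn.Conv2d(ch, out, 4, stride=2, padding=1), nn.ReLU(inplace=True)]
        ch = out
    layers += [nn.AdaptiveAvgPool2d(1), nn.Flatten()]
    return nn.Sequential(*layers), ch

class ClothingWarpNet(nn.Module):
    """Encode the parsing guidance map and the clothing image, regress deformation
    parameters theta for a control-point grid, and warp the clothing with a sampling grid."""
    def __init__(self, guide_ch, cloth_ch, grid=5):
        super().__init__()
        self.enc_guide, g = conv_encoder(guide_ch)
        self.enc_cloth, c = conv_encoder(cloth_ch)
        self.grid = grid
        self.theta = nn.Linear(g + c, 2 * grid * grid)  # 2D offsets for grid x grid control points

    def forward(self, guide_map, cloth_img):
        feat = torch.cat([self.enc_guide(guide_map), self.enc_cloth(cloth_img)], dim=1)
        offsets = self.theta(feat).view(-1, 2, self.grid, self.grid)      # deformation coefficients
        # Upsample the coarse control-point offsets to a dense flow field (TPS approximated bilinearly).
        flow = F.interpolate(offsets, size=cloth_img.shape[-2:], mode="bilinear", align_corners=True)
        base = F.affine_grid(torch.eye(2, 3, device=cloth_img.device).unsqueeze(0)
                             .repeat(cloth_img.size(0), 1, 1), cloth_img.size(), align_corners=True)
        warped = F.grid_sample(cloth_img, base + flow.permute(0, 2, 3, 1),
                               mode="bilinear", align_corners=True)
        return warped
```

In training, the plane deformation loss above would be applied to the regressed offsets so that neighbouring control points do not move apart abruptly.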
As shown in Fig. 4, the inputs of the try-on image generation network are the parsing guidance map, the warped clothing image and the wearer image, and its output is the try-on image. The try-on image generation network is an end-to-end network comprising a generator and a discriminator, the generator consisting of an encoder and a decoder. The generator takes the parsing guidance map, the warped clothing image and the wearer image as input and, under the constraint of the parsing guidance map, generates a coarse try-on result from the pixel information of the warped clothing image and the wearer image; the result then passes through the discriminator, with the feature point matching loss function, to judge whether it conforms to the target pose and to extract more arm-region features, enhancing the details of the coarse try-on result and improving image clarity.
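A condensed sketch of the generator/discriminator pairing follows. The layer counts, instance normalization and PatchGAN-style discriminator are assumptions chosen only to make the end-to-end shape concrete; the class names are illustrative, not the patent's.

```python
import torch
import torch.nn as nn

def down(ic, oc): return nn.Sequential(nn.Conv2d(ic, oc, 4, 2, 1), nn.InstanceNorm2d(oc), nn.ReLU(True))
def up(ic, oc):   return nn.Sequential(nn.ConvTranspose2d(ic, oc, 4, 2, 1), nn.InstanceNorm2d(oc), nn.ReLU(True))

class TryOnGenerator(nn.Module):
    """Encoder-decoder generator: inputs are the parsing guidance map, the warped
    clothing image and the wearer image, concatenated channel-wise."""
    def __init__(self, in_ch, base=64):
        super().__init__()
        self.encoder = nn.Sequential(down(in_ch, base), down(base, base * 2), down(base * 2, base * 4))
        self.decoder = nn.Sequential(up(base * 4, base * 2), up(base * 2, base), up(base, base),
                                     nn.Conv2d(base, 3, 3, 1, 1), nn.Tanh())
    def forward(self, guide_map, warped_cloth, wearer_img):
        x = torch.cat([guide_map, warped_cloth, wearer_img], dim=1)
        return self.decoder(self.encoder(x))

class PoseDiscriminator(nn.Module):
    """PatchGAN-style discriminator scoring the coarse try-on result conditioned on the target pose."""
    def __init__(self, in_ch, base=64):
        super().__init__()
        self.net = nn.Sequential(down(in_ch, base), down(base, base * 2),
                                 nn.Conv2d(base * 2, 1, 3, 1, 1))
    def forward(self, try_on_img, target_pose):
        return self.net(torch.cat([try_on_img, target_pose], dim=1))
```

The generator would be trained with the adversarial signal from the discriminator together with the feature point matching loss, while the discriminator judges the coarse result against the target pose.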
The virtual fitting system for pose transfer adopts the same virtual fitting method as Embodiment 1.
The implementation results show that the invention not only achieves higher semantic segmentation accuracy but also increases the robustness of clothing deformation, so that the try-on result images retain more detail. This greatly improves the virtual try-on effect on high-resolution 2D images and improves both the fitting result and the user experience.
Claims (6)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210795212.8A CN115272632B (en) | 2022-07-07 | 2022-07-07 | Virtual fitting method based on gesture migration |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210795212.8A CN115272632B (en) | 2022-07-07 | 2022-07-07 | Virtual fitting method based on gesture migration |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115272632A (en) | 2022-11-01
CN115272632B (en) | 2023-07-18
Family
ID=83764879
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210795212.8A Active CN115272632B (en) | 2022-07-07 | 2022-07-07 | Virtual fitting method based on gesture migration |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115272632B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116206332B (en) * | 2023-01-31 | 2023-08-08 | 北京数美时代科技有限公司 | Pedestrian re-recognition method, system and storage medium based on attitude estimation |
CN116824002B (en) * | 2023-06-19 | 2024-02-20 | 深圳市毫准科技有限公司 | AI clothing try-on result output method based on fake model and related equipment |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120287122A1 (en) * | 2011-05-09 | 2012-11-15 | Telibrahma Convergent Communications Pvt. Ltd. | Virtual apparel fitting system and method |
JP5605885B1 (en) * | 2014-02-27 | 2014-10-15 | 木下 泰男 | Virtual try-on system and virtual try-on program |
CN110211196B (en) * | 2019-05-28 | 2021-06-15 | 山东大学 | A kind of virtual try-on method and device based on posture guidance |
CN113297944A (en) * | 2020-12-28 | 2021-08-24 | 武汉纺织大学 | Human body posture transformation method and system for virtual fitting of clothes |
CN113052980B (en) * | 2021-04-27 | 2022-10-14 | 云南大学 | Virtual fitting method and system |
-
2022
- 2022-07-07 CN CN202210795212.8A patent/CN115272632B/en active Active
Also Published As
Publication number | Publication date |
---|---|
CN115272632A (en) | 2022-11-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110853119B (en) | Reference picture-based makeup transfer method with robustness | |
CN115272632B (en) | Virtual fitting method based on gesture migration | |
CN113222875B (en) | Image harmonious synthesis method based on color constancy | |
CN114663552B (en) | Virtual fitting method based on 2D image | |
Li et al. | Line drawing guided progressive inpainting of mural damage | |
CN118608910A (en) | Virtual try-on method and system based on adaptive multimodal fusion and dynamic feature enhancement | |
CN118314038A (en) | A method for image shadow removal based on mask thinning | |
CN117893673A (en) | Method and system for generating an animated three-dimensional head model from a single image | |
CN117196935A (en) | Drama makeup migration method based on UV space mapping | |
CN113516604B (en) | Image restoration method | |
CN118037898B (en) | Text generation video method based on image guided video editing | |
CN112241708A (en) | Method and apparatus for generating new person image from original person image | |
CN117593178A (en) | Virtual fitting method based on feature guidance | |
CN117333604A (en) | A method of character facial reenactment based on semantic perception neural radiation field | |
CN116824669A (en) | A face de-occlusion method based on feature reconstruction | |
CN116645451A (en) | High-precision garment texture virtual fitting method and system | |
Chen et al. | A robust transformer GAN for unpaired data makeup transfer | |
Zhang et al. | Norest-net: Normal estimation neural network for 3-D noisy point clouds | |
Chen et al. | NeuralReshaper: single-image human-body retouching with deep neural networks | |
Wang et al. | Uncouple generative adversarial networks for transferring stylized portraits to realistic faces | |
Li et al. | HUMOD: High-Quality Human Modeling From Monocular Virtual Try-On Image | |
CN119888028B (en) | A digital human reconstruction method focusing on the face | |
CN119206101B (en) | Editable facial three-dimensional reconstruction method, system and storage medium | |
CN117274059B (en) | Low-resolution image reconstruction method and system based on image coding and decoding | |
CN118446930B (en) | Monocular video dressing human body space-time feature learning method based on nerve radiation field |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |