CN115311403B - Training method of deep learning network, virtual image generation method and device - Google Patents

Training method of deep learning network, virtual image generation method and device

Info

Publication number
CN115311403B
CN115311403B (application CN202211037100.2A)
Authority
CN
China
Prior art keywords
hairline
hair
node
image
distribution
Prior art date
Legal status
Active
Application number
CN202211037100.2A
Other languages
Chinese (zh)
Other versions
CN115311403A (en)
Inventor
彭昊天
陈睿智
赵晨
Current Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority claimed from CN202211037100.2A
Publication of CN115311403A
Application granted
Publication of CN115311403B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 15/00 3D [Three Dimensional] image rendering
    • G06T 15/04 Texture mapping
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G06V 10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V 10/761 Proximity, similarity or dissimilarity measures
    • G06V 10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/168 Feature extraction; Face representation
    • G06V 40/171 Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships

Abstract

The disclosure provides a training method for a deep learning network, an avatar generation method and apparatus, a device, a medium, and a program product, relates to the field of artificial intelligence, in particular to the technical fields of deep learning, computer vision, virtual/augmented reality, and image processing, and can be applied to scenarios such as virtual digital humans and the metaverse. A specific implementation includes the following steps: determining hairline layout features and hair root distribution features of a hair region in a training sample image; outputting, by using a deep learning network to be trained, hairline node distribution features matched with the hair root distribution features according to the hairline layout features and hair root distribution features of the hair region; determining a similarity distance between the hairline node distribution features and a preset hairline node distribution label, and obtaining a feature loss value based on the similarity distance; and adjusting model parameters of the deep learning network according to the feature loss value to obtain a trained target deep learning network.

Description

Training method of deep learning network, virtual image generation method and device
Technical Field
The present disclosure relates to the field of artificial intelligence, and in particular to the fields of deep learning, computer vision, virtual/augmented reality, and image processing, and may be applied to scenarios such as virtual digital humans and the metaverse.
Background
Avatars are widely used in social, live-streaming, and gaming scenarios. Virtual hairstyle generation is an important part of constructing an avatar, but in some scenarios the virtual hairstyle generation process is costly and produces poor results.
Disclosure of Invention
The present disclosure provides a training method and apparatus for a deep learning network, an avatar generation method and apparatus, an electronic device, a storage medium, and a program product.
According to an aspect of the present disclosure, there is provided a training method of a deep learning network, including: determining hairline layout features and hair root distribution features of a hair region in a training sample image; outputting, by using a deep learning network to be trained, hairline node distribution features matched with the hair root distribution features according to the hairline layout features and hair root distribution features of the hair region; determining a similarity distance between the hairline node distribution features and a preset hairline node distribution label, and obtaining a feature loss value based on the similarity distance; and adjusting model parameters of the deep learning network according to the feature loss value to obtain a trained target deep learning network.
According to another aspect of the present disclosure, there is provided an avatar generation method, including: determining hairline layout features of a hair region in a target image; determining hairline node distribution features matched with hair root distribution features according to the hairline layout features and preset hair root distribution features; and generating a target avatar matched with the target image based on a preset avatar and the hairline node distribution features.
According to another aspect of the present disclosure, there is provided a training apparatus of a deep learning network, including: a training sample image processing module, configured to determine hairline layout features and hair root distribution features of a hair region in a training sample image; a hairline node distribution feature determination module, configured to output, by using a deep learning network to be trained, hairline node distribution features matched with the hair root distribution features according to the hairline layout features and hair root distribution features of the hair region; a feature loss value determination module, configured to determine a similarity distance between the hairline node distribution features and a preset hairline node distribution label, and obtain a feature loss value based on the similarity distance; and a model parameter adjustment module, configured to adjust model parameters of the deep learning network according to the feature loss value to obtain a trained target deep learning network.
According to another aspect of the present disclosure, there is provided an avatar generation apparatus, including: a hairline layout feature determination module, configured to determine hairline layout features of a hair region in a target image; a hairline node distribution feature determination module, configured to determine hairline node distribution features matched with hair root distribution features according to the hairline layout features and preset hair root distribution features; and a target avatar generation module, configured to generate a target avatar matched with the target image based on a preset avatar and the hairline node distribution features.
According to another aspect of the present disclosure, there is provided an electronic device including: at least one processor and a memory communicatively connected to the at least one processor. The memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the above training method of a deep learning network or the above avatar generation method.
According to another aspect of the present disclosure, there is provided a non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the above training method of a deep learning network or the above avatar generation method.
According to another aspect of the present disclosure, there is provided a computer program product, stored on at least one of a readable storage medium and an electronic device, comprising a computer program which, when executed by a processor, implements the above training method of a deep learning network or the above avatar generation method.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
The drawings are for a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 schematically illustrates a system architecture of a training method and apparatus of a deep learning network according to an embodiment of the present disclosure;
FIG. 2 schematically illustrates a flow chart of a training method of a deep learning network according to an embodiment of the present disclosure;
FIG. 3 schematically illustrates a flow chart of a training method of a deep learning network according to another embodiment of the present disclosure;
fig. 4 schematically illustrates a flowchart of an avatar generation method according to an embodiment of the present disclosure;
FIG. 5 schematically illustrates a schematic diagram of an image processing process according to an embodiment of the present disclosure;
FIG. 6 schematically illustrates a block diagram of a training apparatus of a deep learning network, according to an embodiment of the present disclosure;
fig. 7 schematically illustrates a block diagram of an avatar generating apparatus according to an embodiment of the present disclosure;
fig. 8 schematically illustrates a block diagram of an electronic device for performing deep learning network training in accordance with an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present disclosure to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The terms "comprises," "comprising," and/or the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.
All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It should be noted that the terms used herein should be construed to have meanings consistent with the context of the present specification and should not be construed in an idealized or overly formal manner.
Where an expression such as "at least one of A, B and C" is used, it should generally be interpreted in accordance with the meaning commonly understood by those skilled in the art (e.g., "a system having at least one of A, B and C" shall include, but not be limited to, systems having A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B and C together).
The embodiments of the present disclosure provide a training method of a deep learning network. The training method of the deep learning network includes: determining hairline layout features and hair root distribution features of a hair region in a training sample image; outputting, by using a deep learning network to be trained, hairline node distribution features matched with the hair root distribution features according to the hairline layout features and hair root distribution features of the hair region; determining a similarity distance between the hairline node distribution features and a preset hairline node distribution label, and obtaining a feature loss value based on the similarity distance; and adjusting model parameters of the deep learning network according to the feature loss value to obtain a trained target deep learning network.
Fig. 1 schematically illustrates a system architecture of a training method and apparatus of a deep learning network according to an embodiment of the present disclosure. It should be noted that fig. 1 is only an example of a system architecture to which embodiments of the present disclosure may be applied to assist those skilled in the art in understanding the technical content of the present disclosure, but does not mean that embodiments of the present disclosure may not be used in other devices, systems, environments, or scenarios.
The system architecture 100 according to this embodiment may include a data terminal 101, a network 102, and a server 103. The network 102 is the medium used to provide a communication link between the data terminal 101 and the server 103. The network 102 may include various connection types, such as wired or wireless communication links, or fiber-optic cables. The server 103 may be an independent physical server, a server cluster or distributed system formed by multiple physical servers, or a cloud server providing basic cloud computing services such as cloud computing, network services, and middleware services.
The server 103 may be a server providing various services, for example, a server performing deep learning network training based on training sample images provided by the data terminal 101.
For example, the server 103 is configured to determine hair layout features and hair root distribution features of a hair region in a training sample image, output hair node distribution features matching the hair root distribution features according to the hair layout features and the hair root distribution features of the hair region by using a deep learning network to be trained, determine a similarity distance between the hair node distribution features and a preset hair node distribution label, obtain a feature loss value based on the similarity distance, and adjust model parameters of the deep learning network according to the feature loss value to obtain a trained target deep learning network.
It should be noted that the training method of the deep learning network provided by the embodiments of the present disclosure may be performed by the server 103. Accordingly, the training apparatus of the deep learning network provided by the embodiments of the present disclosure may be provided in the server 103. The training method of the deep learning network provided by the embodiments of the present disclosure may also be performed by a server or server cluster that is different from the server 103 and is capable of communicating with the data terminal 101 and/or the server 103. Accordingly, the training apparatus of the deep learning network provided by the embodiments of the present disclosure may also be provided in a server or server cluster that is different from the server 103 and is capable of communicating with the data terminal 101 and/or the server 103.
It should be understood that the numbers of data terminals, networks, and servers in Fig. 1 are merely illustrative. There may be any number of data terminals, networks, and servers, as needed for the implementation.
The embodiment of the present disclosure provides a training method of a deep learning network, and the training method of the deep learning network according to an exemplary embodiment of the present disclosure is described below with reference to fig. 2 to 3 in conjunction with the system architecture of fig. 1. The training method of the deep learning network of the embodiment of the present disclosure may be performed by the server 103 shown in fig. 1, for example.
Fig. 2 schematically illustrates a flow chart of a training method of a deep learning network according to an embodiment of the present disclosure.
As shown in fig. 2, the training method 200 of the deep learning network of the embodiment of the present disclosure may include, for example, operations S210 to S240.
In operation S210, hairline layout features and hair root distribution features of a hair region in a training sample image are determined.
In operation S220, hairline node distribution features matched with the hair root distribution features are output according to the hairline layout features and hair root distribution features of the hair region by using a deep learning network to be trained.
In operation S230, a similarity distance between the hairline node distribution features and a preset hairline node distribution label is determined, and a feature loss value is obtained based on the similarity distance.
In operation S240, model parameters of the deep learning network are adjusted according to the feature loss value, so as to obtain a trained target deep learning network.
An example flow of each operation of the training method of the deep learning network of this embodiment is described below.
Illustratively, the training sample image includes a head region of an object, and the head region may include a face region and a hair region. Before the training sample image is recognized, facial feature points in the training sample image may be extracted, and the training sample image may be registered according to the facial feature points and a preset reference occupancy feature to obtain a registered training sample image. The reference occupancy feature indicates the reference position and proportion that the facial feature points occupy relative to the image in which they are located. For example, the facial feature points of the object in the training sample image may be identified with a facial feature point detection algorithm, such as TCDCN (Tasks-Constrained Deep Convolutional Network) or ASM (Active Shape Model), which is not described in detail here.
For example, the training sample image may be registered to the object template indicated by the reference occupancy feature through translation and scaling operations, yielding the registered training sample image. Suppose that, in the process of registering the training sample image to the object template, the image undergoes a translation t and a scaling s; the same translation t and scaling s also apply to the hair region in the training sample image.
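As an illustration of this registration step, the sketch below estimates a translation t and scaling s that map the detected facial feature points onto the reference positions indicated by the reference occupancy feature, and applies the same transform to other points such as hair nodes. The closed-form least-squares fit and the function names are assumptions for illustration, not the patented implementation.

```python
import numpy as np

def estimate_similarity_transform(face_points, reference_points):
    """Estimate scaling s and translation t so that s * face_points + t
    approximates reference_points in the least-squares sense (no rotation,
    since the text only mentions translation and scaling)."""
    src_mean = face_points.mean(axis=0)
    dst_mean = reference_points.mean(axis=0)
    src_centered = face_points - src_mean
    dst_centered = reference_points - dst_mean
    # Closed-form isotropic scale minimizing ||s * src_centered - dst_centered||^2.
    s = (src_centered * dst_centered).sum() / (src_centered ** 2).sum()
    t = dst_mean - s * src_mean
    return s, t

def register_points(points, s, t):
    """Apply the same translation t and scaling s to any point set,
    e.g. the hair region nodes of the training sample image."""
    return s * points + t

# Hypothetical usage: 2D facial feature points and a reference template.
face_pts = np.array([[120.0, 200.0], [180.0, 198.0], [150.0, 240.0]])
ref_pts = np.array([[0.35, 0.40], [0.65, 0.40], [0.50, 0.55]]) * 256
s, t = estimate_similarity_transform(face_pts, ref_pts)
registered_hair_nodes = register_points(face_pts, s, t)
```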
The hairline layout features of the hair region in the registered training sample image are then determined, and may include, for example, at least one of the following: hair trend features, hair length features, hair depth features, and local hair density.
For example, a global bounding box of the hair region may be determined, and the number of subdivisions of each side of the bounding box may be determined based on a preset voxelization scale and the side length of the bounding box. The bounding box is voxelized according to the number of subdivisions of each side to obtain M voxels corresponding to the bounding box, and the M voxels are sorted and numbered by coordinate order. A voxel (volume element) is the smallest unit of subdivision in three-dimensional space.
For any hair strand in the hair region, starting from the root node, the voxels through which each strand segment passes may be computed, and a unit vector representing the direction of that segment may be recorded in the corresponding voxel. The mean unit vector associated with each voxel is then determined from the unit vectors recorded in the voxel and their number, so that the hair region is converted into a volumetric vector field.
The local hair density of the hair region is obtained from the number of unit vectors in each voxel. The hair trend features of the hair region are determined from the mean unit vector associated with each voxel. The number of voxels that a strand passes through is recorded, and the hair length feature of the corresponding strand is determined from that number. The object pose depth in the training sample image may be determined from the distance between the imaged object and the imaging device, and constitutes the hair depth feature of the hair region.
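A minimal sketch of converting the hair region into a volumetric vector field as described above, assuming each strand is given as an ordered array of 3D nodes; the voxel resolution, helper names, and the per-node (rather than supersampled) segment traversal are illustrative assumptions.

```python
import numpy as np

def voxelize_hair(strands, resolution=32):
    """Convert hair strands (list of (m_i, 3) node arrays) into a volume vector
    field: per-voxel mean unit direction, per-voxel segment count, and a
    per-strand count of voxels passed through."""
    all_nodes = np.concatenate(strands, axis=0)
    bb_min, bb_max = all_nodes.min(axis=0), all_nodes.max(axis=0)
    cell = (bb_max - bb_min) / resolution                  # subdivision of each side
    dir_sum = np.zeros((resolution, resolution, resolution, 3))
    count = np.zeros((resolution, resolution, resolution), dtype=np.int64)

    strand_lengths = []
    for strand in strands:
        visited = set()
        for a, b in zip(strand[:-1], strand[1:]):          # walk segments from the root
            seg = b - a
            unit = seg / (np.linalg.norm(seg) + 1e-8)      # unit vector of segment direction
            idx = np.clip(((a - bb_min) / (cell + 1e-8)).astype(int), 0, resolution - 1)
            dir_sum[tuple(idx)] += unit
            count[tuple(idx)] += 1
            visited.add(tuple(idx))
        strand_lengths.append(len(visited))                # hair length feature: voxels traversed

    mean_dir = dir_sum / np.maximum(count, 1)[..., None]   # hair trend feature per voxel
    local_density = count                                   # local hair density per voxel
    return mean_dir, local_density, strand_lengths
```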
The deep learning network to be trained may then be used to output hairline node distribution features matched with the hair root distribution features according to the hairline layout features and hair root distribution features of the hair region in the training sample image. The similarity distance between the hairline node distribution features and the hairline node distribution label is determined according to the preset hairline node distribution label matched with the hair root distribution features, and a feature loss value based on the similarity distance is obtained. The model parameters of the deep learning network are adjusted according to the feature loss value to obtain the trained target deep learning network.
According to the embodiments of the present disclosure, by determining the hairline layout features of the hair region in the training sample image and using the deep learning network to be trained to output hairline node distribution features matched with the hair root distribution features according to the hairline layout features and hair root distribution features of the hair region, a deep learning network with three-dimensional hairline understanding capability can be trained, the difficulty of hairline layout analysis is reduced, the cost of virtual hairstyle reconstruction can be effectively controlled, and the cost of avatar generation can be effectively reduced.
Fig. 3 schematically illustrates a flow chart of a training method of a deep learning network according to another embodiment of the present disclosure.
As shown in fig. 3, the method 300 may include, for example, operations S310 to S340.
In operation S310, hair layout features of hair regions in the training sample image are determined.
In operation S320, a hairline node predicted coordinate sequence matched with each hair root node coordinate is output, as the hairline node distribution feature, according to the hairline layout features of the hair region and a plurality of hair root node coordinates, by using a deep learning network to be trained.
In operation S330, a similarity distance between the hairline node distribution features and a preset hairline node distribution label is determined, and a feature loss value is obtained based on the similarity distance.
In operation S340, the model parameters of the deep learning network are adjusted according to the feature loss value, so as to obtain the trained target deep learning network.
The following illustrates an example flow of each operation of the training method of the deep learning network of the present embodiment.
For example, a rendered hair image may be obtained by rendering the dense hair data of the training sample image. The rendered hair image forms a three-dimensional hair trend map of the hair region. Features may be extracted from the three-dimensional hair trend map with a convolution layer of the deep learning network to be trained, and the extracted latent vector is used as the hairline layout feature of the dense hair data.
In one example, hair regions in the training sample image may be rendered according to the hair node distribution labels, resulting in a rendered hair image. The hair trend feature indicated by the pixel color values in the rendered hair image is taken as a hair layout feature.
The hairline node distribution label indicates the ground-truth hairline node coordinate sequence matched with each hair root node coordinate. The pixel color value matched with a hairline node may be determined from the node coordinates of adjacent hairline nodes indicated by the hairline node distribution label, and the rendered hair image is obtained from the pixel color values matched with the hairline nodes.
Rendering the hair region in the training sample image according to the hairline node distribution label yields a rendered hair image that forms a three-dimensional hair trend map of the training sample image. This effectively reduces the difficulty of hairline layout analysis, allows sufficient hairline layout information to be obtained from a single face image, and makes it possible to train a deep learning network with three-dimensional hairline understanding capability.
Illustratively, hairline nodes i−1 and i form adjacent hairline nodes, with 1 ≤ i ≤ m, where m represents the total number of hairline nodes in the corresponding strand. The pixel color value matched with the i-th hairline node is determined from the node coordinate vectors V_{i−1} and V_i of the (i−1)-th and i-th hairline nodes indicated by the hairline node distribution label.
For example, the pixel color value L(i) matched with the i-th hairline node may be calculated by equation (1):
L(i) = (norm(V_i − V_{i−1}).xy + 1) / 2 × 255    (1)
Here V_i − V_{i−1} is the coordinate-vector difference between the current hairline node and the previous hairline node, norm denotes normalizing the difference into a unit direction vector, and .xy denotes the projection of the unit direction vector onto the two-dimensional plane, so that norm(V_i − V_{i−1}).xy lies within the range [−1, 1].
Determining the pixel color value matched with each hairline node from the node coordinates of adjacent hairline nodes maps at least one hairline node of each strand into the color space. The pixel color values matched with the hairline nodes can reflect the trend of the corresponding strand: in general, pixels of hairline nodes closer to the front of the strand are lighter in color, and pixels of nodes closer to the end are darker.
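The following sketch implements the color mapping of equation (1) for one strand, assuming 3D node coordinates whose x and y components define the two-dimensional projection; it is an illustrative reading of the formula, not the authoritative rendering code.

```python
import numpy as np

def node_colors(strand):
    """strand: (m, 3) array of hairline node coordinates, ordered from the root.
    Returns an (m - 1, 2) array of pixel color values for nodes 1..m-1,
    encoding the unit direction of each segment per equation (1)."""
    diff = strand[1:] - strand[:-1]                          # V_i - V_{i-1}
    unit = diff / (np.linalg.norm(diff, axis=1, keepdims=True) + 1e-8)
    xy = unit[:, :2]                                         # projection onto the 2D plane, in [-1, 1]
    return (xy + 1.0) / 2.0 * 255.0                          # map to pixel color values

strand = np.array([[0.0, 0.0, 0.0], [0.1, 0.2, 0.0], [0.2, 0.5, 0.1]])
print(node_colors(strand))
```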
The hair region in each training sample image may be considered to have normalized root node coordinates. In one example, dense hair data of a training sample image may be downsampled to obtain sparse hair data. The root node coordinates in the sparse hair data may constitute the root distribution characteristics of the hair region.
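A small sketch of the downsampling step, assuming dense hair data is a list of strand node arrays and that keeping every k-th strand yields the sparse hair data; the stride and names are illustrative assumptions.

```python
import numpy as np

def downsample_hair(dense_strands, stride=8):
    """Keep every stride-th strand; the root node coordinates (first node of
    each kept strand) form the hair root distribution feature."""
    sparse_strands = dense_strands[::stride]
    root_coords = np.stack([s[0] for s in sparse_strands], axis=0)
    return sparse_strands, root_coords
```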
The deep learning network to be trained may be used to output, as the hairline node distribution feature, a hairline node predicted coordinate sequence matched with each hair root node coordinate according to the hairline layout features of the hair region and a plurality of hair root node coordinates. For example, using the deep learning network to be trained, a hairline node predicted coordinate sequence matched with each hair root node coordinate is output according to the hairline layout features and the plurality of hair root node coordinates in the sparse hair data.
The hairline node coordinates in the sparse hair data may be used as the hairline node distribution label. An encoder in the deep learning network encodes the root node coordinates in the sparse hair data to obtain position-encoded data of the root nodes. A decoder in the deep learning network then outputs the hairline node predicted coordinate sequence matched with the hair root node coordinates according to the hairline layout features and the position-encoded data of the root nodes.
For example, the hairline node predicted coordinate sequence may be represented by equation (2):
f(Latent_n, PE(Root_m)) = [Node_{n,m,0}, Node_{n,m,1}, ..., Node_{n,m,i}]    (2)
Here Latent_n represents the hairline layout feature of the n-th training sample image, i.e. the latent vector of the three-dimensional hair trend map of the n-th hairstyle image; Root_m represents the node coordinates of the m-th root node; PE(Root_m) represents the position-encoded data of the m-th root node; Node_{n,m,i} represents the predicted coordinates of the i-th hairline node matched with the m-th root node; and [Node_{n,m,0}, Node_{n,m,1}, ..., Node_{n,m,i}] forms the hairline node predicted coordinate sequence matched with the m-th hair root node.
The hairline node distribution label indicates the ground-truth hairline node coordinate sequence matched with each hair root node coordinate. The similarity distance between the hairline node predicted coordinate sequence and the ground-truth coordinate sequence matched with each root node coordinate may be determined, and a feature loss value based on the similarity distance is obtained. Illustratively, for the plurality of hairline nodes matched with any root node, the similarity distance between the predicted coordinates and the ground-truth coordinates of each hairline node is determined, and the feature loss value used to guide training of the deep learning network is determined based on the similarity distances associated with the hairline nodes.
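A compact sketch of the prediction and loss described above, assuming a PyTorch-style network in which a convolutional encoder produces the latent hairline layout feature, the root node coordinates are position-encoded (here by a simple learned embedding), and an MLP decoder regresses the node coordinate sequence; the layer sizes and the squared-error similarity distance are assumptions, not the patented architecture.

```python
import torch
import torch.nn as nn

class HairNodePredictor(nn.Module):
    def __init__(self, latent_dim=256, pe_dim=64, nodes_per_strand=16):
        super().__init__()
        # Convolutional encoder over the rendered 3D hair trend map (Latent_n).
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, latent_dim),
        )
        # Position encoding of a root node coordinate (learned embedding as a stand-in for PE).
        self.pe = nn.Linear(3, pe_dim)
        # Decoder: (Latent_n, PE(Root_m)) -> predicted node coordinate sequence.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim + pe_dim, 512), nn.ReLU(),
            nn.Linear(512, nodes_per_strand * 3),
        )
        self.nodes_per_strand = nodes_per_strand

    def forward(self, trend_map, roots):
        latent = self.encoder(trend_map)                        # (B, latent_dim)
        pe = self.pe(roots)                                     # (B, R, pe_dim)
        latent = latent[:, None, :].expand(-1, roots.shape[1], -1)
        pred = self.decoder(torch.cat([latent, pe], dim=-1))
        return pred.view(*roots.shape[:2], self.nodes_per_strand, 3)

model = HairNodePredictor()
trend_map = torch.rand(2, 3, 128, 128)                          # rendered hair trend maps
roots = torch.rand(2, 100, 3)                                   # root node coordinates
labels = torch.rand(2, 100, 16, 3)                              # ground-truth node sequences
pred = model(trend_map, roots)
feature_loss = ((pred - labels) ** 2).mean()                    # similarity-distance feature loss
feature_loss.backward()                                         # drives the model parameter update
```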
And adjusting model parameters of the deep learning network according to the characteristic loss value to obtain the trained target deep learning network. The hairline layout understanding capability of the trained deep learning network can be effectively guaranteed, the prediction generalization capability of the deep learning network for hairline node distribution can be improved, and the virtual hairstyle reconstruction effect can be effectively guaranteed.
A distribution prediction model for hairline nodes may be derived based on a trained target deep learning network. For example, the target deep learning network may be used as a distribution prediction model, or the target deep learning network may be adaptively adjusted, and the adjusted deep learning network may be used as the distribution prediction model. The adaptation of the target deep learning network may include, for example, an adjustment for the content of the model parameter weight, the model structure, and the like, which is not limited in this embodiment.
In one example, hair layout features of hair regions in a target image may be determined, and hair node distribution features matching the hair root distribution features are determined from the hair layout features of the hair regions and preset hair root distribution features using a target deep learning network to generate a target avatar matching the target image based on the preset avatar and the hair node distribution features.
According to the embodiments of the present disclosure, the hairline layout features of the hair region in the training sample image are determined, and the deep learning network to be trained is used to output the hairline node predicted coordinate sequence matched with the hair root node coordinates according to the hairline layout features of the hair region and the hair root node coordinates. This effectively improves the three-dimensional hairline understanding capability of the deep learning network, effectively reduces the difficulty of hairline layout analysis, effectively guarantees the hairstyle layout reconstruction effect, improves virtual hairstyle reconstruction efficiency, reduces virtual hairstyle reconstruction cost, effectively improves avatar generation capability, and effectively meets users' individualized and diverse needs.
Fig. 4 schematically illustrates a flowchart of an avatar generation method according to an embodiment of the present disclosure.
As shown in fig. 4, the method 400 may include, for example, operations S410 to S430.
In operation S410, hair layout characteristics of a hair region in a target image are determined.
In operation S420, hairline node distribution features matched with the hair root distribution features are determined according to the hairline layout features and preset hair root distribution features.
In operation S430, a target avatar matching the target image is generated based on the preset avatar and the hairline node distribution characteristics.
An exemplary flow of each operation of the avatar generation method of the present embodiment is illustrated below.
The target image includes an object head image. Before determining the hairline layout features, the object head pose in the target image may be determined, and the target image is registered according to the object head pose and the reference head pose of the preset avatar to obtain a registered target image. Alternatively, based on a three-dimensional head model of the preset avatar, pose transformation parameters of the object head in the target image may be estimated by least-squares optimization, and the target image is registered according to the pose transformation parameters to obtain the registered target image.
Hair placement features of hair regions in the registered target image are determined. In one example, pixel gradient information for a hair region in a registered target image may be determined, and hair trend characteristics associated with the hair region may be determined as hair layout characteristics based on the pixel gradient information. The method is beneficial to reducing the analysis difficulty of the hairline layout, can effectively ensure the reconstruction effect of the virtual hairstyle, and is beneficial to improving the generalization capability and application range of the virtual image generation.
A preset gradient operator may be used to calculate the pixel gradient information in the hair region. For any hairline node in the hair region, if the gradient of the corresponding pixel in the x-axis direction is Gx and the gradient in the y-axis direction is Gy, the hair trend at that node may be denoted θ, with θ = arctan(Gy/Gx). The preset gradient operator may include, for example, the Sobel operator, the Roberts operator, the Prewitt operator, or the Laplacian operator, which is not limited in this embodiment.
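As a sketch of the gradient-based estimate, assuming the Sobel operator as provided by OpenCV; the per-pixel hair trend angle follows θ = arctan(Gy/Gx), computed here with the numerically robust arctan2.

```python
import cv2
import numpy as np

def hair_orientation_from_gradients(gray_hair_region):
    """Estimate a per-pixel hair trend angle theta from image gradients."""
    gx = cv2.Sobel(gray_hair_region, cv2.CV_32F, 1, 0, ksize=3)   # gradient along x (Gx)
    gy = cv2.Sobel(gray_hair_region, cv2.CV_32F, 0, 1, ksize=3)   # gradient along y (Gy)
    theta = np.arctan2(gy, gx)                                    # theta = arctan(Gy / Gx)
    return theta
```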
A directional filter function may also be used to filter the hair region in the target image. The directional filter function may be formed by N symmetric Gabor kernel functions, where N may take the value 32, for example. A Gabor kernel is a Gaussian kernel modulated by a sinusoidal plane wave, and the orientations corresponding to the N Gabor kernels are uniformly distributed over [0, π].
The response value F(x, y, θ) of pixel (x, y) in the direction of angle θ may be detected with the Gabor kernel K(θ); F(x, y, θ) may be calculated with existing algorithms and is not described in detail here. The hair trend feature of the hairline node corresponding to pixel (x, y) is determined from the maximum response value of the pixel over the plurality of angular directions.
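A sketch of the directional filtering step, assuming OpenCV's getGaborKernel with N = 32 orientations uniformly distributed over [0, π]; the kernel size and the other Gabor parameters are illustrative assumptions.

```python
import cv2
import numpy as np

def hair_orientation_from_gabor(gray_hair_region, num_orientations=32):
    """Return the per-pixel angle theta with the maximum Gabor response F(x, y, theta)."""
    thetas = np.linspace(0, np.pi, num_orientations, endpoint=False)
    responses = []
    for theta in thetas:
        kernel = cv2.getGaborKernel(ksize=(17, 17), sigma=4.0, theta=theta,
                                    lambd=8.0, gamma=0.5, psi=0)
        responses.append(cv2.filter2D(gray_hair_region, cv2.CV_32F, kernel))
    responses = np.stack(responses, axis=0)                # (N, H, W) response values
    best = responses.argmax(axis=0)                        # index of the maximum response
    return thetas[best]                                    # hair trend feature per pixel
```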
Hairline node distribution features matched with the hair root distribution features are then determined according to the hairline layout features and the preset hair root distribution features. The hair root distribution features indicate the root node coordinates of the preset avatar, and a hairline node coordinate sequence matched with each root node coordinate may be determined, as the hairline node distribution features, according to the hairline layout features and the plurality of root node coordinates of the preset avatar.
In one example, a pre-trained target deep learning network is used to output hairline node distribution features matched with the hair root distribution features based on the hairline layout features and the hair root distribution features. For example, the hairline layout features and the plurality of root node coordinates of the preset avatar may be used as input data of the pre-trained target deep learning network to obtain a hairline node coordinate sequence matched with each root node coordinate. This effectively reduces the difficulty of hairline layout analysis, effectively improves hairline layout reconstruction efficiency, and helps guarantee the hairstyle reconstruction effect in avatar generation.
The target deep learning network may be obtained with the training method of the deep learning network described above, for example, with the following training process: determining hairline layout features and hair root distribution features of a hair region in a training sample image; outputting, by using a deep learning network to be trained, hairline node distribution features matched with the hair root distribution features according to the hairline layout features and hair root distribution features of the hair region; determining a similarity distance between the hairline node distribution features and a preset hairline node distribution label, and obtaining a feature loss value based on the similarity distance; and adjusting model parameters of the deep learning network according to the feature loss value to obtain the trained target deep learning network.
A virtual hairstyle image may be generated according to each root node coordinate and the corresponding hairline node coordinate sequence, and the target avatar may be generated based on the preset avatar and the virtual hairstyle image. For example, rendering adjustments may be made to the virtual hairstyle image to obtain an adjusted virtual hairstyle image, and the target avatar is generated based on the preset avatar and the adjusted virtual hairstyle image.
The rendering adjustments may include, for example, at least one of base color adjustment, scattering adjustment, highlight adjustment, backlight adjustment, tangent adjustment, ambient light adjustment, and depth offset adjustment. For example, a linear gradient function may be used to adjust the base color map, the diffuse color map, the highlight map, and the like in the virtual hairstyle image. Alternatively, the base color map in the virtual hairstyle image may be multiplied by a sampled noise map, and the product added to a preset tangent vector to obtain a tangent-adjusted virtual hairstyle image. By applying tangent adjustment to the virtual hairstyle image, the micro-facets of real hair can be effectively simulated and the strand-level realism of the virtual hairstyle image effectively enhanced.
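A minimal sketch of the tangent adjustment described above, assuming the base color map and sampled noise map are arrays of matching spatial size and that the preset tangent vector is broadcast per pixel; the blending and clipping are an illustrative reading, not the engine's shading code.

```python
import numpy as np

def tangent_adjust(base_color_map, noise_map, preset_tangent):
    """Multiply the base color map by the sampled noise map, then add the
    preset tangent vector to perturb per-pixel tangents (strand micro-facets)."""
    perturbed = base_color_map * noise_map + preset_tangent
    return np.clip(perturbed, 0.0, 1.0)

base = np.random.rand(256, 256, 3)      # base color map of the virtual hairstyle image
noise = np.random.rand(256, 256, 1)     # sampled noise map
tangent = np.array([0.0, 0.05, 0.0])    # preset tangent vector (assumed value)
adjusted = tangent_adjust(base, noise, tangent)
```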
According to the embodiments of the present disclosure, the hairline layout features of the hair region in the target image are determined, hairline node distribution features matched with the hair root distribution features are determined according to the hairline layout features and the hair root distribution features of the preset avatar, and the target avatar matched with the target image is generated based on the hairline node distribution features and the preset avatar. This effectively improves the hairstyle layout analysis capability during avatar generation, effectively improves the hairstyle reconstruction efficiency during avatar generation, reduces the computational cost and expense of virtual hairstyle reconstruction, yields a good virtual hairstyle construction effect, and effectively meets users' diverse avatar generation needs.
Fig. 5 schematically illustrates a schematic diagram of an image processing procedure according to an embodiment of the present disclosure.
As shown in fig. 5, a hair region in the training sample image 501 is rendered to obtain a rendered hair image 502, and the rendered hair image 502 may form a three-dimensional hair trend graph of the training sample image 501. The feature extraction is performed on the rendered hair image 502 by using the convolution layer 503 of the deep learning network to be trained, and the extracted hidden vector can form the hair trend feature of the training sample image 501.
The hairline coding layer 504 of the deep learning network to be trained outputs hairline node distribution features matched with the hair root distribution features according to the hair trend features and hair root distribution features of the training sample image 501. The hair root distribution features indicate the hair root node coordinates of the hair region in the training sample image 501, and the hairline node distribution features indicate the hairline node predicted coordinate sequence 505 matched with the hair root node coordinates.
The similarity distance between the hairline node distribution features and the preset hairline node distribution label is determined, and a feature loss value based on the similarity distance is obtained. The hairline node distribution label indicates the ground-truth hairline node coordinate sequence matched with the hair root node coordinates, and may be determined by downsampling the dense hair data of the training sample image 501. The model parameters of the deep learning network are adjusted based on the feature loss value to obtain the trained target deep learning network.
The trained target deep learning network may be used to output hairline node distribution features matched with the target image based on the hairline layout features of the hair region in the target image. Illustratively, Gabor filtering may be applied to the target image 506 to obtain a two-dimensional hair trend map 507 of the target image 506. Features are extracted from the two-dimensional hair trend map 507 with a convolution layer 508 of the target deep learning network, and the extracted latent vector forms the hairline layout feature of the target image 506. The hairline coding layer 509 of the target deep learning network outputs hairline node distribution features matched with the hair root distribution features according to the hairline layout features of the target image 506 and the hair root distribution features of the preset avatar; the hairline node distribution features indicate the hairline node predicted coordinate sequence 510 of the preset avatar. The target avatar matched with the target image 506 is generated according to the preset avatar and the hairline node distribution features.
Fig. 6 schematically illustrates a block diagram of a training apparatus of a deep learning network according to an embodiment of the present disclosure.
As shown in fig. 6, the training apparatus 600 of the deep learning network of the embodiment of the present disclosure includes, for example, a training sample image processing module 610, a hairline node distribution feature determination module 620, a feature loss value determination module 630, and a model parameter adjustment module 640.
The training sample image processing module 610 is configured to determine hairline layout features and hair root distribution features of a hair region in a training sample image. The hairline node distribution feature determination module 620 is configured to output, by using a deep learning network to be trained, hairline node distribution features matched with the hair root distribution features according to the hairline layout features and hair root distribution features of the hair region. The feature loss value determination module 630 is configured to determine a similarity distance between the hairline node distribution features and a preset hairline node distribution label, and obtain a feature loss value based on the similarity distance. The model parameter adjustment module 640 is configured to adjust model parameters of the deep learning network according to the feature loss value to obtain a trained target deep learning network.
According to the embodiments of the present disclosure, by determining the hairline layout features of the hair region in the training sample image and using the deep learning network to be trained to output hairline node distribution features matched with the hair root distribution features according to the hairline layout features and hair root distribution features of the hair region, a deep learning network with three-dimensional hairline understanding capability can be trained, the difficulty of hairline layout analysis is reduced, the cost of virtual hairstyle reconstruction can be effectively controlled, and the cost of avatar generation can be effectively reduced.
According to an embodiment of the present disclosure, the training sample image processing module includes: a hair region rendering sub-module, configured to render the hair region according to the hairline node distribution label to obtain a rendered hair image; and a hairline layout feature determination sub-module, configured to use the hair trend feature indicated by the pixel color values in the rendered hair image as the hairline layout feature.
According to an embodiment of the present disclosure, the hair region rendering sub-module includes: a pixel color value determination unit, configured to determine a pixel color value matched with a hairline node according to the node coordinates of adjacent hairline nodes indicated by the hairline node distribution label; and a rendered hair image determination unit, configured to obtain the rendered hair image according to the pixel color values matched with the hairline nodes.
According to an embodiment of the present disclosure, the hair root distribution feature indicates root node coordinates in the hair region. The hairline node distribution feature determination module includes: a hairline node coordinate prediction sub-module, configured to output, as the hairline node distribution feature, a hairline node predicted coordinate sequence matched with each root node coordinate according to the hairline layout features of the hair region and the plurality of root node coordinates, by using the deep learning network to be trained.
According to an embodiment of the present disclosure, the hairline node distribution label indicates a ground-truth hairline node coordinate sequence matched with each hair root node coordinate. The feature loss value determination module includes: a similarity distance determination sub-module, configured to determine a similarity distance between the hairline node predicted coordinate sequence matched with each root node coordinate and the ground-truth coordinate sequence, and obtain a feature loss value based on the similarity distance.
According to an embodiment of the present disclosure, the training sample image processing module is configured to determine at least one of the following features of the hair region: hair trend features, hair length features, hair depth features, and local hair density.
According to an embodiment of the present disclosure, the apparatus further includes a training sample image registration module, configured to: extract facial feature points of the object in the training sample image; and register the training sample image according to the facial feature points and a preset reference occupancy feature to obtain a registered training sample image. The training sample image processing module is configured to determine the hairline layout features and hair root distribution features of the hair region in the registered training sample image, where the reference occupancy feature indicates the reference position and proportion of the facial feature points relative to the image in which they are located.
Fig. 7 schematically illustrates a block diagram of an avatar generating apparatus according to an embodiment of the present disclosure.
As shown in fig. 7, the avatar generating apparatus 700 of the embodiment of the present disclosure includes, for example, a hairline layout feature determining module 710, a hairline node distribution feature determining module 720, and a target avatar generating module 730.
The hairline layout feature determination module 710 is configured to determine hairline layout features of a hair region in a target image. The hairline node distribution feature determination module 720 is configured to determine hairline node distribution features matched with hair root distribution features according to the hairline layout features and preset hair root distribution features. The target avatar generation module 730 is configured to generate a target avatar matched with the target image based on a preset avatar and the hairline node distribution features.
According to the embodiments of the present disclosure, the hairline layout features of the hair region in the target image are determined, hairline node distribution features matched with the hair root distribution features are determined according to the hairline layout features and the hair root distribution features of the preset avatar, and the target avatar matched with the target image is generated based on the hairline node distribution features and the preset avatar. This effectively improves the hairstyle layout analysis capability during avatar generation, effectively improves the hairstyle reconstruction efficiency during avatar generation, reduces the computational cost and expense of virtual hairstyle reconstruction, yields a good virtual hairstyle construction effect, and effectively meets users' diverse avatar generation needs.
According to an embodiment of the present disclosure, a hairline layout feature determination module includes: a pixel gradient information determination submodule for determining pixel gradient information of a hair region in the target image; and the hair trend characteristic determining submodule is used for determining hair trend characteristics associated with the hair area according to the pixel gradient information to serve as hair layout characteristics.
According to an embodiment of the present disclosure, the hair root distribution feature indicates the root node coordinates of the preset avatar. The hairline node distribution feature determination module includes: a hairline node coordinate determination sub-module, configured to determine, as the hairline node distribution feature, a hairline node coordinate sequence matched with each root node coordinate according to the hairline layout features and the plurality of root node coordinates of the preset avatar.
According to an embodiment of the present disclosure, the target avatar generation module includes: a virtual hairstyle image generation sub-module, configured to generate a virtual hairstyle image according to each root node coordinate and the corresponding hairline node coordinate sequence; and a target avatar generation sub-module, configured to generate the target avatar based on the preset avatar and the virtual hairstyle image.
According to an embodiment of the present disclosure, the target avatar generation sub-module includes: a virtual hairstyle image adjustment unit, configured to perform rendering adjustments on the virtual hairstyle image to obtain an adjusted virtual hairstyle image; and a target avatar generation unit, configured to generate the target avatar based on the preset avatar and the adjusted virtual hairstyle image, where the rendering adjustments include at least one of the following: base color adjustment, scattering adjustment, highlight adjustment, backlight adjustment, tangent adjustment, ambient light adjustment, and depth offset adjustment.
According to an embodiment of the disclosure, the apparatus further comprises a target image registration module for: determining the pose of the head of the object in the target image; registering the target image according to the position and the pose of the head of the object and the position and pose of the reference head of the preset virtual image to obtain a registered target image; the hairline layout characteristic determining module is used for: hair placement features of hair regions in the registered target image are determined.
According to an embodiment of the present disclosure, the hairline node distribution feature determination module includes: a hairline node coordinate prediction sub-module, configured to output hairline node distribution features matched with the hair root distribution features according to the hairline layout features and the hair root distribution features by using a pre-trained target deep learning network.
It should be noted that, in the technical solutions of the present disclosure, the collection, storage, use, processing, transmission, provision, and disclosure of the information involved all comply with relevant laws and regulations and do not violate public order and good morals.
According to embodiments of the present disclosure, the present disclosure also provides an electronic device, a readable storage medium and a computer program product.
Fig. 8 schematically illustrates a block diagram of an electronic device for performing deep learning network training in accordance with an embodiment of the present disclosure.
Fig. 8 illustrates a schematic block diagram of an example electronic device 800 that may be used to implement embodiments of the present disclosure. Electronic device 800 is intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be exemplary only and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 8, the apparatus 800 includes a computing unit 801 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 802 or a computer program loaded from a storage unit 808 into a Random Access Memory (RAM) 803. In the RAM 803, various programs and data required for the operation of the device 800 can also be stored. The computing unit 801, the ROM 802, and the RAM 803 are connected to each other by a bus 804. An input/output (I/O) interface 805 is also connected to the bus 804.
Various components in the device 800 are connected to the I/O interface 805, including: an input unit 806, such as a keyboard, a mouse, or the like; an output unit 807, such as various types of displays, speakers, and the like; a storage unit 808, such as a magnetic disk, an optical disk, etc.; and a communication unit 809, such as a network card, a modem, a wireless communication transceiver, or the like. The communication unit 809 allows the device 800 to exchange information/data with other devices over a computer network such as the Internet and/or various telecommunication networks.
The computing unit 801 may be a variety of general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of the computing unit 801 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 801 performs the methods and processes described above, for example, the training method of the deep learning network. For example, in some embodiments, the training method of the deep learning network may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as the storage unit 808. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 800 via the ROM 802 and/or the communication unit 809. When the computer program is loaded into the RAM 803 and executed by the computing unit 801, one or more steps of the training method of the deep learning network described above may be performed. Alternatively, in other embodiments, the computing unit 801 may be configured to perform the training method of the deep learning network in any other suitable manner (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuit systems, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems On Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: being implemented in one or more computer programs, where the one or more computer programs may be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor, and which may receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for carrying out the methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general-purpose computer, a special-purpose computer, or other programmable data processing apparatus, such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowcharts and/or block diagrams to be implemented. The program code may execute entirely on a machine, partly on a machine, as a stand-alone software package partly on a machine and partly on a remote machine, or entirely on a remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which the user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with the user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server incorporating a blockchain.
It should be appreciated that steps may be reordered, added, or deleted using the various forms of flows shown above. For example, the steps recited in the present disclosure may be performed in parallel, sequentially, or in a different order, as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved; no limitation is imposed herein.
The above detailed description should not be taken as limiting the scope of the present disclosure. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present disclosure are intended to be included within the scope of the present disclosure.

Claims (22)

1. A training method of a deep learning network, comprising:
determining hairline layout features and hair root distribution features of a hair region in a training sample image;
outputting hairline node distribution characteristics matched with the hair root distribution characteristics according to the hairline layout characteristics and the hair root distribution characteristics of the hair region by utilizing a deep learning network to be trained;
determining similar distances between the hairline node distribution characteristics and preset hairline node distribution labels, and obtaining characteristic loss values based on the similar distances; and
adjusting, according to the characteristic loss value, model parameters of the deep learning network to obtain a trained target deep learning network;
wherein the root distribution feature indicates root node coordinates in the hair region, the root node coordinates in the sparse hair data of the training sample image constituting the root distribution feature;
wherein the outputting, by utilizing the deep learning network to be trained, hairline node distribution characteristics matched with the hair root distribution characteristics according to the hairline layout characteristics and the hair root distribution characteristics of the hair region comprises:
outputting a hairline node prediction coordinate sequence matched with each hairline node coordinate according to the hairline layout characteristics of the hair region and a plurality of hairline node coordinates in the sparse hairline data by using the deep learning network to be trained, and taking the hairline node prediction coordinate sequence as the hairline node distribution characteristics;
wherein the hairline node distribution tag indicates a hairline node truth value coordinate sequence matched with each hairline node coordinate; and
the step of determining the similar distance between the hairline node distribution characteristics and the preset hairline node distribution labels to obtain the characteristic loss value based on the similar distance comprises the following steps:
determining a similar distance between the hairline node predicted coordinate sequence matched with each hairline node coordinate and the hairline node true value coordinate sequence, and obtaining the characteristic loss value based on the similar distance.
2. The method of claim 1, wherein the determining hair layout features of hair regions in a training sample image comprises:
rendering the hair area according to the hair node distribution labels to obtain a rendered hair image; and
and taking the hair trend characteristic indicated by the pixel color value in the rendered hair image as the hair layout characteristic.
3. The method of claim 2, wherein said rendering the hair region according to the hair node distribution label results in a rendered hair image, comprising:
determining a pixel color value matched with the hairline node according to the node coordinates of adjacent hairline nodes indicated by the hairline node distribution label; and
and obtaining the rendered hair image according to the pixel color value matched with the hair node.
4. The method of claim 1, wherein the determining hair layout features of hair regions in a training sample image comprises:
Determining at least one of the following characteristics of the hair region:
hair strike characteristics, hair length characteristics, hair depth characteristics, and hair local density.
5. The method of any one of claims 1 to 4, further comprising:
extracting facial feature points of an object in the training sample image;
performing image registration on the training sample image according to the facial feature points and the preset reference occupancy feature to obtain a registered training sample image; and
the determining hair layout features and hair root distribution features of the hair region in the training sample image comprises:
determining the hair placement features and the root distribution features of hair regions in the registered training sample image,
wherein the reference occupancy feature indicates a reference occupancy of the facial feature point relative to the image in which it is located.
6. An avatar generation method, comprising:
determining hair layout features of hair regions in the target image;
determining hairline node distribution characteristics matched with the hair root distribution characteristics according to the hairline layout characteristics and preset hair root distribution characteristics; and
generating a target avatar matched with the target image based on a preset avatar and the hairline node distribution characteristics;
wherein the root distribution feature indicates the root node coordinates of the preset avatar, and the root node coordinates in the sparse hair data of the preset avatar constitute the root distribution feature;
wherein the determining the hairline node distribution characteristics matched with the hair root distribution characteristics according to the hairline layout characteristics and the preset hair root distribution characteristics comprises:
determining a hairline node coordinate sequence matched with each hairline node coordinate according to the hairline layout characteristics and a plurality of hairline node coordinates in the sparse hairline data of the preset virtual image, and taking the hairline node coordinate sequence as the hairline node distribution characteristics;
wherein the determining the hairline node distribution characteristics matched with the hair root distribution characteristics according to the hairline layout characteristics and the preset hair root distribution characteristics further comprises:
and outputting the hairline node distribution characteristics matched with the hairline root distribution characteristics according to the hairline layout characteristics and the hairline root distribution characteristics by utilizing a pre-trained target deep learning network.
7. The method of claim 6, wherein the determining hair layout features of hair regions in the target image comprises:
Determining pixel gradient information of a hair region in the target image; and
and determining hair trend characteristics associated with the hair area according to the pixel gradient information to serve as the hair layout characteristics.
8. The method of claim 6, wherein the generating the target avatar matching the target image based on the preset avatar and the hairline node distribution characteristics comprises:
generating a virtual hairstyle image according to the hairroot node coordinates and the corresponding hairline node coordinate sequence;
and generating the target avatar based on the preset avatar and the virtual hairstyle image.
9. The method of claim 8, wherein the generating the target avatar based on the preset avatar and the virtual hairstyle image comprises:
rendering and adjusting the virtual hairstyle image to obtain an adjusted virtual hairstyle image;
generating the target avatar based on the preset avatar and the adjusted virtual hairstyle image, wherein the rendering adjustment includes at least one of the following adjustment contents:
basic color adjustment, scattering adjustment, highlight adjustment, backlight adjustment, tangential adjustment, ambient light adjustment, and depth offset adjustment.
10. The method of any of claims 6 to 9, further comprising:
determining the pose of the head of the object in the target image;
registering the target image according to the object head pose and the reference head pose of the preset virtual image to obtain a registered target image; and
the determining hair layout features of the hair region in the target image includes:
the hair layout features of hair regions in the registered target image are determined.
11. A training apparatus for a deep learning network, comprising:
the training sample image processing module is used for determining hairline layout characteristics and hair root distribution characteristics of the hair region in the training sample image;
the hairline node distribution characteristic determining module is used for outputting hairline node distribution characteristics matched with the hair root distribution characteristics according to the hairline layout characteristics and the hair root distribution characteristics of the hair region by utilizing a deep learning network to be trained;
the characteristic loss value determining module is used for determining similar distances between the distribution characteristics of the hairline nodes and preset hairline node distribution labels to obtain characteristic loss values based on the similar distances; and
The model parameter adjustment module is used for adjusting model parameters of the deep learning network according to the characteristic loss value to obtain a trained target deep learning network;
wherein the root distribution feature indicates root node coordinates in the hair region, the root node coordinates in the sparse hair data of the training sample image constituting the root distribution feature;
the hairline node distribution characteristic determining module comprises:
the hairline node coordinate prediction sub-module is used for outputting a hairline node prediction coordinate sequence matched with each hairline node coordinate as the hairline node distribution characteristic according to the hairline layout characteristic of the hairline area and a plurality of hairline node coordinates in the sparse hairline data by utilizing the deep learning network to be trained;
the hairline node distribution label indicates a hairline node truth value coordinate sequence matched with each hairline node coordinate; and
the feature loss value determining module includes:
and the similarity distance determining submodule is used for determining the similarity distance between the hairline node prediction coordinate sequence matched with the hairline node coordinates and the hairline node true value coordinate sequence, and obtaining the characteristic loss value based on the similarity distance.
12. The apparatus of claim 11, wherein the training sample image processing module comprises:
the hair region rendering sub-module is used for rendering the hair region according to the hair node distribution labels to obtain a rendered hair image; and
and the hairline layout feature determination submodule is used for taking hairline trend features indicated by pixel color values in the rendered hair image as the hairline layout features.
13. The apparatus of claim 12, wherein,
the hair region rendering submodule includes:
a pixel color value determining unit configured to determine a pixel color value matched with the hairline node according to node coordinates of adjacent hairline nodes indicated by the hairline node distribution label; and
and the rendered hair image determining unit is used for obtaining the rendered hair image according to the pixel color value matched with the hairline node.
14. The apparatus of claim 11, wherein the training sample image processing module is to:
determining at least one of the following characteristics of the hair region:
hair strike characteristics, hair length characteristics, hair depth characteristics, and hair local density.
15. The apparatus of any of claims 11 to 14, further comprising a training sample image registration module to:
extracting facial feature points of an object in the training sample image;
performing image registration on the training sample image according to the facial feature points and the preset reference occupancy feature to obtain a registered training sample image; and
the training sample image processing module is used for:
determining the hair placement features and the root distribution features of hair regions in the registered training sample image,
wherein the reference occupancy feature indicates a reference occupancy of the facial feature point relative to the image in which it is located.
16. An avatar generation apparatus comprising:
the hair layout feature determining module is used for determining hair layout features of the hair area in the target image;
the hairline node distribution characteristic determining module is used for determining hairline node distribution characteristics matched with the hairline root distribution characteristics according to the hairline layout characteristics and preset hairline root distribution characteristics; and
a target avatar generation module for generating a target avatar matched with the target image based on a preset avatar and the hairline node distribution characteristics;
wherein the root distribution feature indicates the root node coordinates of the preset avatar, and the root node coordinates in the sparse hair data of the preset avatar constitute the root distribution feature;
the hairline node distribution characteristic determining module comprises:
a hairline node coordinate determining sub-module, configured to determine a hairline node coordinate sequence matched with each hairline node coordinate according to the hairline layout feature and a plurality of hairline node coordinates in the sparse hairline data of the preset avatar, as the hairline node distribution feature;
wherein the hairline node distribution characteristic determining module includes:
and the hairline node coordinate prediction sub-module is used for outputting the hairline node distribution characteristics matched with the hairline root distribution characteristics according to the hairline layout characteristics and the hairline root distribution characteristics by utilizing a pre-trained target deep learning network.
17. The apparatus of claim 16, wherein the hair layout feature determination module comprises:
a pixel gradient information determination sub-module for determining pixel gradient information of a hair region in the target image; and
and the hair trend characteristic determining submodule is used for determining hair trend characteristics associated with the hair area according to the pixel gradient information to serve as the hair layout characteristics.
18. The apparatus of claim 16, wherein the target avatar generation module comprises:
the virtual hairstyle image generation sub-module is used for generating a virtual hairstyle image according to the hairroot node coordinates and the corresponding hairline node coordinate sequence;
and the target avatar generation sub-module is used for generating the target avatar based on the preset avatar and the virtual hair style image.
19. The apparatus of claim 18, wherein the target avatar generation submodule comprises:
the virtual hair style image adjusting unit is used for rendering and adjusting the virtual hair style image to obtain an adjusted virtual hair style image;
a target avatar generating unit for generating the target avatar based on the preset avatar and the adjusted virtual hairstyle image, wherein the rendering adjustment includes at least one of the following adjustment contents:
basic color adjustment, scattering adjustment, highlight adjustment, backlight adjustment, tangential adjustment, ambient light adjustment, and depth offset adjustment.
20. The apparatus of any of claims 16 to 19, further comprising a target image registration module to:
Determining the pose of the head of the object in the target image;
registering the target image according to the object head pose and the reference head pose of the preset virtual image to obtain a registered target image; and
the hairline layout characteristic determining module is used for:
the hair layout features of hair regions in the registered target image are determined.
21. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1 to 5 or to perform the method of any one of claims 6 to 10.
22. A non-transitory computer readable storage medium storing computer instructions for causing the computer to perform the method of any one of claims 1-5 or to perform the method of any one of claims 6-10.
CN202211037100.2A 2022-08-26 2022-08-26 Training method of deep learning network, virtual image generation method and device Active CN115311403B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211037100.2A CN115311403B (en) 2022-08-26 2022-08-26 Training method of deep learning network, virtual image generation method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211037100.2A CN115311403B (en) 2022-08-26 2022-08-26 Training method of deep learning network, virtual image generation method and device

Publications (2)

Publication Number Publication Date
CN115311403A CN115311403A (en) 2022-11-08
CN115311403B true CN115311403B (en) 2023-08-08

Family

ID=83863948

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211037100.2A Active CN115311403B (en) 2022-08-26 2022-08-26 Training method of deep learning network, virtual image generation method and device

Country Status (1)

Country Link
CN (1) CN115311403B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116030185A (en) * 2022-12-02 2023-04-28 北京百度网讯科技有限公司 Three-dimensional hairline generating method and model training method

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10699463B2 (en) * 2016-03-17 2020-06-30 Intel Corporation Simulating the motion of complex objects in response to connected structure motion
CN112541963B (en) * 2020-11-09 2023-12-26 北京百度网讯科技有限公司 Three-dimensional avatar generation method, three-dimensional avatar generation device, electronic equipment and storage medium

Patent Citations (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2487239A1 (en) * 2009-10-05 2012-08-15 Kao Corporation Susceptibility gene for hair shapes
CN103035030A (en) * 2012-12-10 2013-04-10 西北大学 Hair model modeling method
CN103606186A (en) * 2013-02-02 2014-02-26 浙江大学 Virtual hair style modeling method of images and videos
CN107808136A (en) * 2017-10-31 2018-03-16 广东欧珀移动通信有限公司 Image processing method, device, readable storage medium storing program for executing and computer equipment
CN109064547A (en) * 2018-06-28 2018-12-21 北京航空航天大学 A kind of single image hair method for reconstructing based on data-driven
CN109408653A (en) * 2018-09-30 2019-03-01 叠境数字科技(上海)有限公司 Human body hair style generation method based on multiple features retrieval and deformation
US10489683B1 (en) * 2018-12-17 2019-11-26 Bodygram, Inc. Methods and systems for automatic generation of massive training data sets from 3D models for training deep learning networks
CN111354079A (en) * 2020-03-11 2020-06-30 腾讯科技(深圳)有限公司 Three-dimensional face reconstruction network training and virtual face image generation method and device
CN111583367A (en) * 2020-05-22 2020-08-25 构范(厦门)信息技术有限公司 Hair simulation method and system
CN112036266A (en) * 2020-08-13 2020-12-04 北京迈格威科技有限公司 Face recognition method, device, equipment and medium
KR102386828B1 (en) * 2020-10-28 2022-04-14 부산대학교 산학협력단 System and Method for Predicting of facial profile after prosthodontic treatment using deep learning
CN113744286A (en) * 2021-09-14 2021-12-03 Oppo广东移动通信有限公司 Virtual hair generation method and device, computer readable medium and electronic equipment
CN114187633A (en) * 2021-12-07 2022-03-15 北京百度网讯科技有限公司 Image processing method and device, and training method and device of image generation model
CN114202597A (en) * 2021-12-07 2022-03-18 北京百度网讯科技有限公司 Image processing method and apparatus, device, medium, and product
CN114494784A (en) * 2022-01-28 2022-05-13 北京百度网讯科技有限公司 Deep learning model training method, image processing method and object recognition method
CN114863000A (en) * 2022-03-22 2022-08-05 网易(杭州)网络有限公司 Method, device, medium and equipment for generating hairstyle
CN114723888A (en) * 2022-04-08 2022-07-08 北京百度网讯科技有限公司 Three-dimensional hair model generation method, device, equipment, storage medium and product

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Hair modeling based on particle systems; Cao Yanfang; Jiang Yuming; Computer Engineering and Design (Issue 10); 2493-2495 *

Also Published As

Publication number Publication date
CN115311403A (en) 2022-11-08

Similar Documents

Publication Publication Date Title
CN108229296B (en) Face skin attribute identification method and device, electronic equipment and storage medium
CN114140603B (en) Training method of virtual image generation model and virtual image generation method
CN114187633B (en) Image processing method and device, and training method and device for image generation model
CN108205803B (en) Image processing method, and training method and device of neural network model
CN111325851B (en) Image processing method and device, electronic equipment and computer readable storage medium
CN112862807B (en) Hair image-based data processing method and device
CN114842123B (en) Three-dimensional face reconstruction model training and three-dimensional face image generation method and device
CN114792359B (en) Rendering network training and virtual object rendering method, device, equipment and medium
CN114723888B (en) Three-dimensional hair model generation method, device, equipment, storage medium and product
CN114549710A (en) Virtual image generation method and device, electronic equipment and storage medium
CN113591566A (en) Training method and device of image recognition model, electronic equipment and storage medium
CN116309983B (en) Training method and generating method and device of virtual character model and electronic equipment
CN116363261A (en) Training method of image editing model, image editing method and device
CN115311403B (en) Training method of deep learning network, virtual image generation method and device
CN116416376A (en) Three-dimensional hair reconstruction method, system, electronic equipment and storage medium
CN112907569A (en) Head image area segmentation method and device, electronic equipment and storage medium
CN114708374A (en) Virtual image generation method and device, electronic equipment and storage medium
CN113052962B (en) Model training method, information output method, device, equipment and storage medium
CN113766117B (en) Video de-jitter method and device
CN115359166B (en) Image generation method and device, electronic equipment and medium
CN116524162A (en) Three-dimensional virtual image migration method, model updating method and related equipment
CN108256477B (en) Method and device for detecting human face
CN114529649A (en) Image processing method and device
CN114419182A (en) Image processing method and device
CN114758391B (en) Hair style image determining method, device, electronic equipment, storage medium and product

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant