CN114202629A

CN114202629A - Human body model establishing method, system, equipment and storage medium

Info

Publication number: CN114202629A
Application number: CN202010876584.4A
Authority: CN
Inventors: 张胜凯; 周润楠; 焦年红; 郑天祥; 闫浩男; 杨超杰
Original assignee: Beijing Momo Information Technology Co Ltd
Current assignee: Beijing Momo Information Technology Co Ltd
Priority date: 2020-08-27
Filing date: 2020-08-27
Publication date: 2022-03-18

Abstract

The invention discloses a method for establishing a human body model. The method comprises: obtaining a two-dimensional image of a target human body; feeding the two-dimensional human body contour image into a first neural network trained by deep learning to regress the joint points; obtaining information such as a joint point map of the target human body; feeding this information into a second neural network trained by deep learning to regress the human body posture and body shape parameters; obtaining three-dimensional human body parameters, comprising three-dimensional action posture parameters and three-dimensional body shape parameters; constructing a three-dimensional standard human body model, namely a base mannequin, from a mathematical model; inputting the obtained groups of shape-base and skeleton parameters into the three-dimensional standard human body model for fitting; and obtaining a three-dimensional human body model with the same posture and body shape as the target human body. By building its own standard human body model, the invention selects shape bases suited to the characteristics of Asian body types and uses parameters corresponding to 20 shape bases and 170 bones, thereby generating a three-dimensional human body model that is closer to Asian body types than the SMPL model of the Max Planck Institute and that offers better independent operation and controllability.

Description

Human body model establishing method, system, equipment and storage medium

Technical Field

The invention belongs to the field of three-dimensional human body modeling, and in particular relates to a method, a system, equipment and a storage medium for establishing a human body model by extracting relevant information from a photo using machine learning.

Background

With the development of internet technology, online shopping has become increasingly popular. Compared with shopping in a physical store, online shopping offers a wider range of goods and greater convenience. However, it also has problems that are not easy to solve, the most important being that the goods cannot be inspected in person before purchase. Among all product categories, this problem is most prominent for clothing. In a physical store the shopper can try on a garment and see the effect immediately, whereas an online clothing store cannot provide an image tailored to the individual consumer; it can only provide pictures of a model wearing the garment, and sometimes no fitting picture at all. The consumer therefore cannot judge, intuitively and in real time, how well the garment matches his or her own figure, which results in a large number of returns.

In response, operators have tried to provide consumers with simulated fitting effects using virtual fitting techniques. Virtual fitting and outfit-changing techniques are also used in other settings, such as online games. As a result, this technology has developed rapidly.

Virtual fitting is an application in which a user can view the effect of changing clothes on a terminal screen in real time without actually putting the garment on. Existing outfit-changing technology mainly comprises planar fitting and three-dimensional virtual fitting. The former collects pictures of the user and pictures of the clothes, then cuts and splices them to form a dressed image. Because this image processing is simple and crude, the result looks unrealistic: the user's actual body shape is not considered at all, and rigidly pasting a garment onto the user's picture cannot meet the user's needs. The latter usually acquires the person's three-dimensional information through a three-dimensional capture device and combines it with the characteristics of the clothes, or takes body data entered manually by the user, generates a three-dimensional human body model according to certain rules, and then combines that model with the clothing texture map. Overall, such three-dimensional virtual fitting requires extensive data acquisition or three-dimensional computation, has a high hardware cost, and is not easy to popularize among ordinary users.

With the development of cloud computing, artificial intelligence and the processing capacity of smart terminals, two-dimensional virtual fitting technology has emerged. Such techniques essentially comprise three steps: (1) processing the personal body information provided by the user to obtain a target human body model; (2) processing the clothing information to obtain a clothing model; (3) fusing the human body model and the clothing model to generate a simulated image of the person wearing the clothing.

However, uncertainties accumulate across the process design, the selection of model parameters, the neural network training method and so on, so the quality of the final outfit-changing picture falls short of that of traditional three-dimensional virtual fitting. Establishing the human body model is the foundational step, and the subsequent dressing process relies on the model generated earlier. Once the human body model is generated inaccurately, problems such as an excessive difference in body shape between the model and the person trying on the clothes, loss of skin texture and loss of body parts readily arise and degrade the final outfit-changing picture.

In the field of image processing, three-dimensional reconstruction refers to establishing a mathematical model suitable for computer representation and processing on a three-dimensional object, is the basis for processing, operating and analyzing the properties of the three-dimensional object in a computer environment, is also a key technology for establishing virtual reality expressing an objective world in a computer, and is widely applied to the fields of computer animation, virtual reality, industrial detection and the like.

In the general field of computer vision, human body modeling can start from several kinds of input, which fall into three broad categories: omnidirectional scanning of a real human body with a 3D scanning device, three-dimensional reconstruction based on multi-view depth photography, and three-dimensional reconstruction that combines a given image with a human body model. First, omnidirectional scanning with a 3D scanning device yields the most accurate information, but the equipment is usually expensive, the person being scanned must cooperate closely, and the whole process places high demands on the processing equipment, so it is generally confined to professional fields. Second, multi-view three-dimensional reconstruction requires multiple overlapping views of the body to be reconstructed and a spatial transformation relation among the images; multiple groups of cameras capture multiple images, which are stitched into a 3D model. The operation is relatively simple, but the computational complexity remains high, and in most cases multi-angle images can be obtained only if the person is physically present. Moreover, a model stitched from multi-angle depth-camera pictures carries no body scale data and cannot provide a basis for 3D perception.
Third, the method that combines a single image with a human body model requires only one image. A neural-network-based method for intelligently generating three-dimensional human body feature curves obtains the weights and thresholds that describe the curves of the neck, chest, waist, hip and other parts of the body; a three-dimensional body curve matching the real body shape is then generated directly from size parameters of the body cross-sections such as girth, width and thickness, giving the predicted human body model. However, because the input information is so limited, this method still consumes a large amount of computation, and the final model is not satisfactory.

Given internet technology and the characteristics of its network environment, directly producing the final human body model from a single image is undoubtedly the preferable mode. It is the most convenient: the user does not need to be physically present and can complete the whole outfit-changing process with a single photo. The remaining question is quality: this mode will become mainstream as long as the resulting picture can be guaranteed to be substantially equivalent to a real 3D simulated outfit change. The key is to obtain, from one photograph, a human body model as close as possible to the real state of the body.

In the prior art, methods for constructing a human body model generally fall into several types. (1) Regression-based methods reconstruct a voxel representation of the human body with a convolutional neural network. The algorithm first estimates the positions of the main joint points of the body from the input picture; then, in a voxel grid of specified size positioned according to the key points, it determines whether each unit voxel is occupied and describes the reconstructed body shape by the overall shape of the occupied voxels. (2) Simple human skeleton key points are roughly marked on the image, and a human body model is then initially matched and fitted to these rough key points to obtain the approximate shape of the body. (3) The human skeleton is represented by 23 skeleton nodes, the posture of the whole body by the rotation of each skeleton node, and the shape of the body by 6890 vertex positions; during fitting, the positions of the skeleton nodes are given and the shape and posture parameters are fitted simultaneously to reconstruct the three-dimensional human body. Alternatively, a CNN model predicts key points on the image, and the SMPL model is then fitted to obtain an initial human body model. The fitted shape parameters are then used to regress bounding boxes for the individual body joints, one bounding box per joint, each represented by an axial length and a radius. Finally, the initial model and the regressed bounding boxes are combined to obtain the three-dimensional human body reconstruction.
These methods suffer from low modeling speed, insufficient modeling precision, and a reconstruction quality that depends strongly on the body shape and posture database that was created.

In the prior art, a human body modeling method based on body measurement data is disclosed. As shown in fig. 1, the method includes: acquiring body measurement data; performing linear regression on a pre-established human body model through a pre-trained prediction model according to the body measurement data, and fitting to obtain a predicted human body model, wherein the pre-established human body model comprises a plurality of groups of predefined marker feature points and corresponding standard shape bases, and the body measurement data comprises measurement data corresponding to each group of marker feature points; and obtaining a target human body model from the predicted human body model, wherein the target human body model comprises measurement data, a target shape base and a target shape coefficient. However, this method places very high demands on the body measurement data, which include length data and girth data such as height, arm length, shoulder width, leg length, calf length, thigh length, foot length, head circumference, chest circumference, waist circumference and thigh circumference; these must not only be measured but also processed by calculation. Although computation is saved, the user experience is very poor and the procedure is very cumbersome. In addition, the training of this human body model follows the training approach of the SMPL model.

The SMPL model is a parameterized human body model proposed by the Max Planck Institute that supports arbitrary human body modeling and animation driving. Its biggest difference from traditional LBS (linear blend skinning) is that it models how the body surface shape changes with posture, so it can simulate the bulging and hollowing of muscles as the limbs move. Surface distortion during motion is thereby avoided, and the shape of muscles stretching and contracting can be depicted accurately. In this method, beta and theta are the input parameters: beta represents 10 parameters describing body shape, such as tall or short, fat or thin, and head-to-body ratio, and theta represents 75 parameters describing the overall motion pose of the body and the relative angles of 24 joints. However, the core of this model generation method is to accumulate a large amount of training data in order to learn the relation between body shape and the shape bases. That relation is strongly correlated: the shape bases cannot be controlled independently and are not easy to decouple. For example, a certain correlation exists between the arms and the legs, so that in theory the legs move along with the arms when the arms move. It is therefore difficult to improve the SMPL model for body types with different characteristics.

The second prior art discloses a 3D human body modeling method based on a single photo, which comprises the following steps: acquiring a photo, analyzing it, marking the key points of the human body in the photo, and calculating the spatial coordinates of the key points; acquiring the distance between the skeleton points of a pre-created standard human body model and the key points in the photo, aligning the skeleton points with the key points, and generating a basic human body model; acquiring the base texture map of the pre-created standard human body model, calculating the difference between the base texture map and the skin texture of the face in the photo, and fusing them using an edge channel to generate basic texture data; and generating a 3D human body model from the basic human body model and the basic texture data. 3D human body modeling is thus achieved from one picture, and because the model is supported by a skeleton and a muscle system, it can produce expressions and actions. However, in this method the key points of the user picture are matched to the key points of the standard mannequin by distance and adjusted to reach the posture of the target human body, and the final human body model is obtained only after difference calculation and fusion between the base texture map and the skin texture in the picture.

The third prior art discloses a method for generating a three-dimensional human body model, which comprises the following steps: acquiring a two-dimensional human body image; and inputting the two-dimensional human body image into a three-dimensional standard human body model to obtain the three-dimensional human body parameters corresponding to the image. The three-dimensional standard human body model is obtained by inputting training samples into a neural network for training, where the training comprises: inputting the standard two-dimensional human body image of a training sample into the neural network to obtain predicted three-dimensional human body parameters corresponding to that image; adjusting a three-dimensional flexible deformable model according to the predicted parameters to obtain a predicted three-dimensional human body model; and obtaining, by reverse mapping from the joint point positions in the predicted three-dimensional human body model, the predicted joint point positions in the standard two-dimensional human body image. In this modeling approach, only joint parameters are used for the model's evaluation and for the parameters finally output by the neural network, and these parameters are then fine-tuned toward the target posture on top of the mature body shape of the SMPL model. Although the amount of computation is reduced, the input parameters are few and the adjustment can be completed only on the basis of the SMPL prediction model, so it is difficult to output a human body model that is ideal and highly consistent with the target human posture.

Therefore, to match the development trend of the internet industry, the field of virtual fitting, and human body modeling in particular, has always pursued three basic goals: minimal input information, minimal computation and the best possible effect. An optimal balance must be found among the three, yielding a human body modeling method whose input is simple, whose computation does not exceed the capacity of the terminal device, and whose effect approaches that of professional equipment.

Disclosure of Invention

Based on the above problems, the present invention provides a method, system, device and storage medium for constructing a human body model, which overcomes the above problems.

The invention provides a method for establishing a human body model, which comprises the following steps: 1) acquiring a two-dimensional image of a target human body; 2) processing the image to obtain a two-dimensional human body contour image of the target human body; 3) acquiring three-dimensional human body parameters of the target human body model through a two-stage neural network, wherein the three-dimensional human body parameters comprise three-dimensional action posture parameters and three-dimensional body shape parameters; 4) constructing a three-dimensional standard human body model from a mathematical model; 6) mapping the obtained three-dimensional posture and body shape parameters to a plurality of shape-base and skeleton parameters of the three-dimensional standard human body model; 7) inputting the obtained groups of shape-base and skeleton parameters into the three-dimensional standard human body model for fitting; 8) obtaining a three-dimensional target human body model mesh with the same posture and body shape as the target human body.
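The data flow of these steps can be sketched as a chain of functions. This is only an illustrative outline: the class names, field names and the five callables below are hypothetical stand-ins introduced for the example, not interfaces defined by the invention, and the toy lambdas replace the real contour extraction, neural networks and model fitting.

```python
from dataclasses import dataclass
from typing import Any, Callable, Sequence


@dataclass
class StageOneFeatures:
    """Outputs of the first network (field names are hypothetical)."""
    joint_map: Any
    segmentation_map: Any
    keypoints: Any


@dataclass
class BodyParameters:
    """Outputs of the second network: posture and body shape parameters."""
    pose: Sequence[float]
    shape: Sequence[float]


def build_body_model(
    image: Any,
    extract_contour: Callable[[Any], Any],
    first_network: Callable[[Any], StageOneFeatures],
    second_network: Callable[[StageOneFeatures], BodyParameters],
    to_standard_params: Callable[[BodyParameters], dict],
    fit_standard_model: Callable[[dict], Any],
) -> Any:
    """Chain the steps in order; every callable is supplied by the caller."""
    contour = extract_contour(image)       # 2D contour image of the body
    features = first_network(contour)      # joint map, segmentation, key points
    params = second_network(features)      # posture and body shape parameters
    standard = to_standard_params(params)  # shape-base and skeleton parameters
    return fit_standard_model(standard)    # fitted target body mesh


# Toy stand-ins, used only to show how data moves between the steps.
mesh = build_body_model(
    image="photo",
    extract_contour=lambda img: f"contour({img})",
    first_network=lambda c: StageOneFeatures(c, "segments", "keypoints"),
    second_network=lambda f: BodyParameters(pose=[0.1, 0.2], shape=[1.0]),
    to_standard_params=lambda p: {"bones": list(p.pose), "bases": list(p.shape)},
    fit_standard_model=lambda s: {"mesh_from": s},
)
```

Passing the stages in as callables keeps the sketch independent of any particular network architecture, which the text does not specify.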

Further, the two-dimensional human body contour image is obtained using a target detection algorithm, which is a network for rapidly generating target regions based on a convolutional neural network. Before the two-dimensional human body image is input into the first neural network model, the method further comprises training the neural network. The training samples comprise standard two-dimensional human body images annotated with the original joint point positions, which are marked on the two-dimensional human body images manually and with high accuracy.

Further, the step of obtaining the three-dimensional human body parameters of the target human body model further comprises feeding the two-dimensional human body contour image into the first neural network trained by deep learning to regress the joint points. This step further comprises obtaining a joint point map of the target human body, semantic segmentation maps of the parts of the human body, and body key points and/or body skeleton points. The generated joint point map, semantic segmentation maps, body skeleton points and key point information of the target human body are then fed into the second neural network trained by deep learning to regress the human body posture and body shape parameters.

Furthermore, the three-dimensional standard human body model is defined by the parameters of a plurality of shape bases and by a plurality of skeleton parameters. The shape bases together form the whole human body model, and each shape base is controlled and changed independently by its own parameters, without affecting the others.
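One way to picture independently controlled shape bases is as per-vertex offsets that are scaled by one coefficient each and added to a template mesh. The sketch below is an assumption made for illustration: the three-vertex "leg", the two bases and the function names are hypothetical, and the text does not state how the 20 shape bases of the invention are actually defined.

```python
def apply_shape_bases(template, bases, betas):
    """Return template vertices plus the sum of beta_k * base_k offsets.

    template: list of [x, y, z] vertices.
    bases: list of dicts mapping vertex index -> (dx, dy, dz) offset.
    betas: one coefficient per base.
    """
    if len(bases) != len(betas):
        raise ValueError("need exactly one coefficient per shape base")
    mesh = [list(v) for v in template]
    for base, beta in zip(bases, betas):
        for idx, offset in base.items():
            for axis in range(3):
                mesh[idx][axis] += beta * offset[axis]
    return mesh


def leg_lengths(mesh):
    """Thigh = hip to knee, calf = knee to ankle, measured along y."""
    hip, knee, ankle = mesh
    return hip[1] - knee[1], knee[1] - ankle[1]


# Hypothetical toy leg: hip, knee and ankle on a vertical line.
TEMPLATE = [[0.0, 1.0, 0.0], [0.0, 0.5, 0.0], [0.0, 0.0, 0.0]]
# Lengthening the thigh moves knee and ankle down together.
THIGH_BASE = {1: (0.0, -0.1, 0.0), 2: (0.0, -0.1, 0.0)}
# Lengthening the calf moves only the ankle down.
CALF_BASE = {2: (0.0, -0.1, 0.0)}

longer_thigh = leg_lengths(apply_shape_bases(TEMPLATE, [THIGH_BASE, CALF_BASE], [1.0, 0.0]))
longer_calf = leg_lengths(apply_shape_bases(TEMPLATE, [THIGH_BASE, CALF_BASE], [0.0, 1.0]))
```

In this toy setup, raising the first coefficient changes only the thigh length and raising the second changes only the calf length, which is the kind of decoupled control the paragraph describes.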

Further, the skeleton is driven from the initial posture to the target posture by the following steps: 1) obtaining the position coordinates of the initial posture and the target posture; 2) generating an animation sequence that moves from the initial posture to the target posture; 3) generating the animation sequence by interpolating frames of the mesh; 4) using a non-uniform interpolation speed, which is slow near the initial point and the target point and fast during the middle of the movement; 5) reducing the driving speed toward the end of the movement and holding the final target posture still for several frames to obtain the complete animation sequence; 6) completing the driving of the skeleton from the initial posture to the target posture.
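The non-uniform interpolation with a final hold can be sketched as follows. The cosine easing curve is an assumption chosen because it is slow at both ends and fast in the middle; the text does not name a specific curve, and the function names, the flat list-of-floats pose layout and the default frame counts are likewise hypothetical.

```python
import math


def ease_in_out(t):
    """Map t in [0, 1] to [0, 1], slow near both ends and fast in the middle."""
    return 0.5 - 0.5 * math.cos(math.pi * t)


def interpolate_pose(start, target, n_frames=10, hold_frames=3):
    """Return a list of poses moving from start to target.

    start, target: equal-length lists of joint values.
    n_frames: number of moving frames, including both endpoints.
    hold_frames: extra static frames appended at the target pose.
    """
    if len(start) != len(target):
        raise ValueError("start and target must have the same length")
    if n_frames < 2:
        raise ValueError("n_frames must be at least 2")
    frames = []
    for i in range(n_frames):
        w = ease_in_out(i / (n_frames - 1))
        frames.append([(1 - w) * s + w * g for s, g in zip(start, target)])
    # Hold the final pose so the motion settles before the sequence ends.
    frames.extend(list(target) for _ in range(hold_frames))
    return frames


frames = interpolate_pose([0.0], [1.0], n_frames=11, hold_frames=3)
first_step = frames[1][0] - frames[0][0]
middle_step = frames[6][0] - frames[5][0]
```

With these settings the per-frame displacement near the start (`first_step`) is several times smaller than in the middle of the motion (`middle_step`), and the last three frames repeat the target pose.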

Furthermore, if the posture of the target human body is complex, accurate fitting is completed in a multi-stage fitting mode.

Furthermore, in the three-dimensional human body model there is a mathematical weight relation between the skeleton points and the model mesh, so that once the skeleton points are determined, the human body model in the target posture can be determined accordingly.
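A common form of weight relation between bones and mesh vertices is linear blend skinning (LBS), which the Background mentions; whether the invention uses exactly this formulation is not stated, so the sketch below is an assumption. It works in 2D, rotates each vertex about every bone's pivot, and blends the results with per-vertex weights. For brevity each bone rotates independently about its own pivot rather than through a parent-child chain, and all names are hypothetical.

```python
import math


def rotate_about(point, pivot, angle):
    """Rotate a 2D point about a pivot by angle (radians, counter-clockwise)."""
    c, s = math.cos(angle), math.sin(angle)
    x, y = point[0] - pivot[0], point[1] - pivot[1]
    return (pivot[0] + c * x - s * y, pivot[1] + s * x + c * y)


def linear_blend_skinning(vertices, weights, pivots, angles):
    """Blend each vertex's per-bone transformed positions by its weights.

    vertices: list of (x, y) rest positions.
    weights: weights[i][j] is the influence of bone j on vertex i;
             each row is expected to sum to 1.
    pivots, angles: one pivot point and one rotation angle per bone.
    """
    skinned = []
    for vertex, row in zip(vertices, weights):
        x = y = 0.0
        for w, pivot, angle in zip(row, pivots, angles):
            px, py = rotate_about(vertex, pivot, angle)
            x += w * px
            y += w * py
        skinned.append((x, y))
    return skinned


# Toy arm: three vertices on a line, upper bone fixed, lower bone bent 90 degrees.
rest = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
weights = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
posed = linear_blend_skinning(
    rest, weights, pivots=[(0.0, 0.0), (1.0, 0.0)], angles=[0.0, math.pi / 2]
)
```

Here the vertex at the joint stays in place, the first vertex follows the fixed bone, and the last vertex swings with the bent bone to roughly (1, 1), so setting the bone angles alone determines the posed mesh.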

In addition, a human body model establishing system is also disclosed, comprising: 1) a two-dimensional human body image acquisition module; 2) a joint point map obtaining module; 3) a semantic segmentation map obtaining module; 4) a regression module for human body posture and body shape parameters; 5) a three-dimensional standard human body model fitting module; 6) a three-dimensional human body model data processing and refining module; 7) a target human body model mesh output module.

A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the method steps of any of the preceding claims.

An electronic device comprises a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with one another through the communication bus; the memory is used for storing a computer program; and the processor is used for implementing any of the above method steps when executing the program stored in the memory.

The invention has the beneficial effects that:

1. User operation is simple. The invention analyzes a whole-body picture of a person with deep neural networks to obtain accurate three-dimensional human body model parameters. A single ordinary picture is enough to model the body quickly, which suits the characteristics and trends of the internet era and is simple and fast. The user needs no preparation; uploading a photo is all the user has to do. If the invention is applied to scenarios such as entertainment mini-programs or online shopping, user experience and user stickiness can be greatly enhanced. A 3D model that corresponds to the real shape of the body is obtained without a depth camera or multiple groups of cameras, opening up broad application scenarios in industries such as clothing and health.

2. Deep neural networks are used extensively. The invention makes full use of the advantages of deep learning networks and can restore the posture and body shape of a person with high precision in a variety of complex scenes. Different neural networks serve different purposes: using neural network models with different input conditions and training modes, the invention achieves accurate separation of the body contour against a complex background, semantic segmentation of the body, and determination of key points and joint points, eliminating the influence of loose clothing and hairstyle and approaching the real body type and shape as closely as possible. The prior art also uses neural network models, but because the input conditions, input parameters and training modes differ, the roles and functions of those models differ greatly.

3. The neural network model is more principled and targeted. Some prior-art image processing methods focus too much on simply getting a model out and spend no time polishing its details; the mapping from a 2D picture to a 3D body model is accomplished purely by training on massive image data. Although this is efficient, the processing flow is oversimplified and the three-dimensional human body model is generated entirely by the neural network model, so the consistency of body proportions and details is unsatisfactory. Such a model does not help subsequent processing and can even become an obstacle that later stages find hard to overcome. In the invention, the human body contour, the semantic segmentation, the key points and the joint points produced by the first-stage neural network all serve as inputs, so model parameters are generated from multiple angles. The parameters output by the second-stage neural network fall into two categories, posture and body shape, so action posture and body shape can be controlled separately, and combined with the reference model, the posture and body shape of the target human body can be reproduced accurately.

4. The human body model is accurate and controllable. Currently popular methods for reconstructing the human body from a single image mainly reconstruct a parameterized human body model. The most commonly used parameterized model is the SMPL model of the Max Planck Institute, which contains two sets of 72 parameters describing body posture and body shape. For single-picture reconstruction, the two-dimensional joint positions are estimated from the picture, and the SMPL parameters are then obtained by minimizing the projection distance between the three-dimensional joints and the two-dimensional joints, which yields the human body. However, the SMPL model is obtained mainly by deep learning and training on a large number of human body model examples. The relation between body shape and shape bases is a global association that is hard to decouple, so the body part one wants to control cannot be controlled at will, and the generated model cannot reach high consistency with the real posture and body shape. In addition, if the SMPL model is used in the subsequent dressing process, its ability to represent the geometric details of the body surface is limited, and the detailed texture of clothes on the body surface cannot be reconstructed well. By contrast, the human body model of the invention is not obtained through training. Its parameters have correspondences defined on mathematical principles, that is, each group of parameters is independent and does not affect the others, so the model is more interpretable during transformation and can better represent the shape change of a particular part of the body.
Body shapes vary widely from person to person, and for many people the ratio of thigh to calf does not match any fixed proportion. By controlling the input parameters, the model can control the thighs and calves and adjust their lengths separately, so the proportion of the legs can be determined accurately.

5. The model is better suited to Asian body types. Body modeling typically involves designing a number of standard body models, also called base mannequins. With a self-built standard human body model, the body can be driven from the initial posture to the target posture, and this work is the basis on which a fitted garment changes with the posture of the body. Only when the body reaches the target posture accurately can the process by which the clothing follows the body to the target posture be calculated. For this purpose, the invention builds its own set of standard human body models (standard mannequins) that better match Asian body types, instead of using the SMPL model of the Max Planck Institute, whose base human body models are trained on European body data. This set of mannequins can comprise 170 skeleton parameters and 20 shape-base parameters, which greatly enriches the details of the human body model; its expression of detail exceeds that of the SMPL model. Because each base is controlled independently, every part of the mannequin can be controlled and modified independently and accurately as required, making each mannequin look better. In addition, local parts of the mannequin can be adjusted manually at a later stage, for example the number of vertices and the number of faces, which models such as SMPL do not support. Besides the height, other attributes such as overall build (fat or thin), arm length, leg proportions, waist length and waist circumference can all be controlled accurately, so that the mannequin better matches the user's measurements.

6. Model fitting is fast and the fitted posture is accurate. To fit the posture of the target human body quickly and faithfully, a multi-stage fitting method is adopted. In the first fitting, only the key parts, or the parts that strongly influence the posture, are fitted. Because few parameters take part in the calculation, an approximate posture of the target human body model can be fitted quickly, which guarantees that the posture of the model is basically consistent with the target human body. To raise the fitting accuracy, a second fitting is then performed using essentially all parameters. Because the model has already moved roughly into place, only fine adjustment is needed, so although many more parameters are involved in this step, the amount of calculation is greatly reduced. After this, a target human body model with more than 95% consistency can generally be obtained. If that is not achieved, a third, fine fitting can be performed on top of the previous result, focusing on the key parts and the regions of interest, which essentially eliminates the remaining imperfections. This saves more than half the time compared with running the simulation calculation over all posture parameters from start to finish. Actual measurement shows that the fitted model is highly realistic, that the simulation of clothes and human posture is better than with the traditional method, and that processing time is reduced considerably.
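The coarse-then-fine idea can be illustrated with a toy optimizer. Everything here is an assumption made for the example: the objective is a plain sum of squared differences between parameters and known targets, minimized by gradient descent, whereas the real fitting objective, optimizer and choice of key parameters are not specified in the text.

```python
def fit(params, target, active, lr=0.25, steps=20):
    """Gradient descent on sum((p - t)^2), updating only the active indices."""
    p = list(params)
    for _ in range(steps):
        for i in active:
            grad = 2.0 * (p[i] - target[i])
            p[i] -= lr * grad
    return p


def multi_stage_fit(initial, target, key_indices, coarse_steps=5, fine_steps=20):
    """Stage 1 moves only the key parameters; stage 2 refines all of them."""
    coarse = fit(initial, target, key_indices, steps=coarse_steps)
    fine = fit(coarse, target, range(len(initial)), steps=fine_steps)
    return coarse, fine


def max_error(params, target):
    """Largest absolute difference between a parameter and its target."""
    return max(abs(p - t) for p, t in zip(params, target))


# Hypothetical setup: six pose parameters, of which the first two are "key".
initial = [0.0] * 6
target = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
coarse, fine = multi_stage_fit(initial, target, key_indices=[0, 1])
```

After the coarse stage only the two key parameters have moved close to their targets while the others are untouched; the fine stage then brings every parameter close to its target. A third pass could be added the same way by calling `fit` again on a chosen subset of indices.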

7. To reach the posture of the target human body realistically, the change from the initial posture to the target posture is completed with an optimized frame interpolation method. Unlike the traditional frame interpolation method, the bone information of the target posture is obtained by model regression prediction, and an animation sequence moving from the initial posture to the target posture is generated at the same time. A time sequence of bone information from the initial posture to the target posture is formed by frame interpolation such as linear interpolation or nearest-neighbor interpolation. While the animation sequence is generated, the mesh is interpolated frame by frame. The frame interpolation speed is set to be slow near the initial point and near the target point and fast during the middle of the movement. More importantly, the model is held still for several frames on reaching the final target posture, so that it has an effective buffer between moving at high speed and coming to rest. The complete animation sequence is thus obtained, and the posture fitting accuracy of the human body model is higher. Compared with inserting frames at a constant speed, this method is closer to the laws of motion of the real physical world, the simulation of clothing and body posture is better, and the processing time is reduced considerably.

According to the invention, a self-built standard human body model is established, body-shape bases suited to the characteristics of the Asian body are selected, and a two-stage neural network model is used to generate the parameters corresponding to 20 body-shape bases and 170 bones, so that the generated three-dimensional human body model is closer to the Asian body type than the SMPL model of the Max Planck Institute and offers better independent operation and controllability.

Drawings

In order to illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. The drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.

FIG. 1 is an overall process flow diagram of one embodiment;

FIG. 2 is a process flow diagram of a model parameter acquisition module of an embodiment;

FIG. 3 is a flowchart of a human model fitting process according to an embodiment;

FIG. 4 is a schematic diagram of the system of the present invention.

Detailed Description

Features and exemplary embodiments of various aspects of the present invention will be described in detail below, and in order to make objects, technical solutions and advantages of the present invention more apparent, the present invention will be further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not to be construed as limiting the invention. It will be apparent to one skilled in the art that the present invention may be practiced without some of these specific details. The following description of the embodiments is merely intended to provide a better understanding of the present invention by illustrating examples of the present invention.

It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.

The following describes a method for processing a human body image according to an embodiment of the present invention in detail with reference to the accompanying drawings.

As shown in fig. 1, embodiments of the present invention provide a method, a system, a device, and a storage medium for generating a three-dimensional human body model. Through deep learning and simulated motion settings, they greatly reduce the amount and complexity of computation needed to generate the three-dimensional human body model, and achieve realism far exceeding that of existing human body models generated from 2D images.

The modeling method comprises the following steps: 1) acquiring a two-dimensional image of a target human body; 2) processing the image to obtain a two-dimensional human body contour image of the target human body; 3) acquiring three-dimensional human body parameters of a target human body model through a two-stage neural network, wherein the three-dimensional human body parameters comprise three-dimensional human body action posture parameters and three-dimensional human body shape parameters; 4) combining the mathematical model to construct a three-dimensional standard human body model; 6) mapping the obtained three-dimensional human body posture and body type parameters to several groups of base and bone parameters of the three-dimensional standard human body model; 7) inputting the obtained groups of base and bone parameters into the three-dimensional standard human body model for fitting; 8) obtaining a three-dimensional target human body model mesh with the same posture and body type as the target human body.

The human body modeling method of the invention consists roughly of the following three parts: first, obtaining the parameters of the human body model in the target posture; second, generating a standard human body model; and third, fitting the standard human body model to the target parameters and driving it, to obtain a human body model consistent with the posture and body type of the target human body.

The first part processes the acquired human body image to obtain the parameter information needed to generate the human body model. Previously, the skeletal key points were usually selected manually, but this is inefficient and does not meet the fast pace of the internet era. Now that neural networks are prevalent, using a deep-learning neural network instead of manual selection of the key points has become a trend. How to use the neural network efficiently, however, requires further research. In general, the parameter acquisition system is built on the idea of a two-stage neural network plus data refinement. As shown in fig. 2, a deep-learning neural network is used to generate these parameters, which mainly includes the following sub-steps: 1) acquiring a two-dimensional image of a target human body; 2) processing the image to obtain a two-dimensional human body contour image of the target human body; 3) substituting the two-dimensional human body contour image into a first neural network subjected to deep learning to carry out regression of the joint points; 4) obtaining a joint point map of the target human body, semantic segmentation maps of all parts of the human body, body key points and body bone points; 5) substituting the generated joint point map, semantic segmentation map, body bone points and key point information of the target human body into a second neural network subjected to deep learning to carry out regression of the human body posture and body type parameters; 6) obtaining the output three-dimensional human body parameters, including three-dimensional human body action posture parameters and three-dimensional human body shape parameters.

The two-dimensional image of the target human body may be a two-dimensional image including a human body image in any posture and in any dressing.

The two-dimensional human body contour image is acquired with a target detection algorithm, which is a convolutional-neural-network-based network that rapidly generates target regions.

Before the two-dimensional human body image is input into the first neural network model, the method further comprises training the neural network. The training samples comprise standard two-dimensional human body images marked with the positions of the original joint points, and these positions are marked on the two-dimensional human body images manually with high accuracy. Here, a target image is first acquired, and human body detection is performed on it using a target detection algorithm. Human body detection here does not mean measuring a real human body with an instrument. In the invention, it means that any given image, usually a two-dimensional picture containing enough information (for example, a picture that includes the face, limbs and torso), is searched with a certain strategy to determine whether it contains a human body and, if it does, to give parameters such as the position and size of the human body. In this embodiment, before the key points of the human body in the target image are acquired, human body detection is performed on the target image to obtain a human body frame indicating the position of the human body. Since the image input by a user can be any image, it inevitably contains some non-human background, such as tables, chairs, trees, cars and buildings, and this useless background is removed with mature algorithms.

Meanwhile, semantic segmentation, joint point detection, bone detection and edge detection are carried out; collecting this 1D point information and 2D surface information lays a good foundation for generating the 3D human body model later. A first-stage neural network is used to generate a map of the joint points of the human body. Optionally, the target detection algorithm may be a network, based on a convolutional neural network, that rapidly generates target regions. The first neural network must be trained on a large amount of data: photos collected from the network are labeled manually and then input into the neural network for training. After deep learning, the neural network can, as soon as a photo is input, produce a joint point map with basically the same accuracy and effect as manually labeled joint points, while being tens or even hundreds of times more efficient than manual labeling. Human body joint points generally serve as human body key points. The important key points include a vertex (top of the head) key point, a neck key point, a shoulder key point, an elbow key point, a wrist key point, a hand key point, an underarm key point, a waist key point, a hip key point, a knee key point, an ankle key point, a toe-top key point, and a key point at the center between the two legs. Optionally, the key points of the human body include 1 key point at the top of the head, 2 key points of the neck, 2 key points of the shoulder, 4 key points of the elbow, 4 key points of the wrist, 2 key points of the hand, 2 key points of the underarm, 2 key points of the waist, 2 key points of the hip, 4 key points of the knee, 4 key points of the ankle, 2 key points of the top of the toes, and 1 key point at the center between the two legs. Optionally, the number of the key points of the human body is 68.
Key points other than the important key points can be determined from the positions of the important key points. For example, such a key point may be taken as an equal-division point (for instance the midpoint) of the straight segment that joins two important key points in the two-dimensional image plane.
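The derivation of secondary key points from important key points can be illustrated with a minimal sketch. It assumes that "bisected point" means a point dividing the planar segment between two important key points into equal parts; the function name `division_points` and the pixel coordinates are hypothetical and do not come from the specification.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) pixel coordinates in the 2D image


def division_points(a: Point, b: Point, parts: int = 2) -> List[Point]:
    """Return the interior points that split segment a-b into `parts` equal pieces.

    parts=2 yields the single midpoint; parts=3 yields two trisection points.
    """
    if parts < 2:
        raise ValueError("parts must be at least 2")
    return [
        (a[0] + (b[0] - a[0]) * k / parts, a[1] + (b[1] - a[1]) * k / parts)
        for k in range(1, parts)
    ]


# Hypothetical important key points (assumed values, for illustration only).
shoulder: Point = (120.0, 200.0)
elbow: Point = (100.0, 300.0)

# A secondary key point placed halfway along the upper arm.
mid_upper_arm = division_points(shoulder, elbow)[0]
```

With more than two parts, the same helper yields several evenly spaced secondary key points along a limb.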

In the invention, obtaining the positions of the joint points of the human body in the picture completes only the first step: 1D point information is obtained, and 2D surface information is then generated from it. This work can be done with a neural network model and mature algorithms from the prior art. The invention redesigns the workflow and the point at which the neural network model intervenes, and sets the various conditions and parameters sensibly, so that parameter generation is more efficient and requires less manual participation. This suits internet application scenarios well. In a virtual dressing program, for example, the user obtains the dressing result almost instantly without waiting, which is vital to making the program attractive to users.

After the relevant 1D point information and 2D surface information are obtained, these parameters or results, namely the joint point map, the semantic segmentation map, the body bone points and the key point information of the target human body, are taken as inputs and substituted into a second neural network subjected to deep learning to carry out regression of the human body posture and body type parameters. Through the regression calculation of the second neural network, several groups of three-dimensional human body parameters, including three-dimensional human body action posture parameters and three-dimensional human body shape parameters, are output immediately. Preferably, the loss function of the neural network is designed based on the three-dimensional standard human body model (base mannequin), the predicted three-dimensional human body model, a standard two-dimensional human body image in which the positions of the original joint points are labeled, and a standard two-dimensional human body image including the positions of the predicted joint points.

The second part is to design and model a set of base mannequins in advance. Its main content is to combine the mathematical model to construct a three-dimensional standard human body model, namely a base mannequin.

The SMPL human body model of the Max Planck Institute can avoid surface distortion of the human body during motion and can accurately depict the shape of muscles as they stretch and contract. In that method, β and θ are the input parameters: β represents 10 parameters of the human body such as height, build and head-to-body ratio, and θ represents 75 parameters covering the overall motion pose of the human body and the relative angles of 24 joints. The β parameters are blend shape parameters for the body type; the change of the body shape is controlled by 10 incremental templates, and the change controlled by each parameter can be depicted by an animation. Studying the continuous animation of parameter changes shows clearly that a continuous change of any one shape parameter causes linked local or even global changes of the human body model. Because the model aims to reflect the movement of human muscle tissue, a linear change of any parameter of the SMPL human body model causes mesh changes over a large area. Put figuratively, when the parameter β1 is adjusted, the model interprets the change of β1 directly as a change of the whole body: the user may only want to adjust the proportion of the waist, but the model forces the build of the legs, chest and even hands to change with it. Although this way of working greatly simplifies the workflow and improves efficiency, it is very inconvenient for a project that pursues modeling quality.
Because the SMPL human body model is trained on pictures and measurement data of Western bodies, it conforms to Western body types, and its rules of body change basically follow the typical variation curves of Western people. When it is applied to modeling Asian bodies, many problems occur, for example in the proportions of the arms and legs, the proportion of the waist to the body, the proportion of the neck, the length of the legs and the length of the arms. Our research shows that these aspects differ considerably, and if the SMPL human body model is applied as-is, the final result cannot meet our requirements.

Therefore, a self-made human body model is adopted to improve the result. The core of the approach is to build human body blendshape bases so that the human body can be controlled accurately and independently.

The three-dimensional standard human body model (base mannequin) is composed of the parameters of 20 body-shape bases and 170 bone parameters. The body-shape bases together form every part of the whole human body model. Each base represents one fixed physical attribute of the body, for example neck thickness, neck length, arm thickness, arm length, leg length, thigh thickness or overall build, and each base is controlled and changed independently by its own base parameters without affecting the others. This may look like only a few more parameters than the SMPL human body model, but the content is clearly different: these are 20 independent groups of control parameters, not ten groups of parameters linked to one another.

Accurate control means, on the one hand, that the number of control parameters is increased instead of reusing the ten β control parameters of the Max Planck Institute model. Besides overall build, the adjustable parameters now include the length of the arms, the length of the legs, and the build of the waist, hips and chest, and the number of bone parameters is more than doubled. This greatly widens the range of adjustable parameters and provides a good basis for designing a more refined standard human body model. Independent control means that each base, such as the waist, legs, hands or head, is controlled on its own, and each bone can be adjusted in length independently of the others without physical linkage, so that the human body model can be fine-tuned more effectively. The model is no longer a fully automatic one that cannot be adjusted to the form the designer wants.

Our self-made three-dimensional standard mannequin is constructed using mathematical models, which include a linear interpolation model. The three-dimensional standard human body model has mathematical weight relations between the bone points and the model mesh, so determining the bone points ties them to the mesh of the standard human body model; with these relations and parameters in place, virtual dressing becomes simpler and more realistic. The self-made model embodies a correspondence grounded in mathematical principles. In effect, the model is redesigned from two sides, human aesthetics and statistical analysis of data, so that by its own design rules it produces what is regarded as a correct model for Asian body types. This differs clearly from the SMPL human body model, which is trained on big data: the parameter transformations of the self-made model are more interpretable and represent local changes of the body better. Because the changes follow a mathematical rule, one parameter does not affect all the others, and the arms and legs remain completely independent; when the length of the forearm is adjusted, neither the length of the upper arm nor the thickness of the forearm changes. Designing this many distinct parameters avoids the shortcomings of a human body model trained on big data, allows the human body model to be controlled accurately in more dimensions rather than only through a few indices such as height, and greatly improves the modeling result.

When each base is designed, at least two groups of parameters must be set for it. These groups represent the human body model at the design endpoints of the base. Take the forearm-length base as an example: one group holds the shape parameters for a forearm length of 10 centimeters, and the other holds the shape parameters for a forearm length of 30 centimeters. When the forearm length of a target human body lies between these two extremes, a control variable taking a value between 0 and 1 is selected according to that forearm length, which gives the parameters of the shape base for the target human body; substituting them into the model yields the mesh model of the forearm. If cost and computing power allow, more groups of base parameters can be set, making the details of the human body model more vivid and closer to reality.
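The endpoint scheme above can be sketched as a linear interpolation between the two endpoint shapes of one base. This is a minimal illustration under assumptions, not the model of the invention: the helper names `control_value` and `blend_base`, the clamping to [0, 1], and the single-vertex "forearm tip" data are all hypothetical.

```python
from typing import List, Sequence, Tuple

Vertex = Tuple[float, ...]  # (x, y, z) position of one mesh vertex


def control_value(measure: float, low: float, high: float) -> float:
    """Map a measurement to a control variable in [0, 1] (clamped at the endpoints)."""
    t = (measure - low) / (high - low)
    return max(0.0, min(1.0, t))


def blend_base(low_shape: Sequence[Vertex],
               high_shape: Sequence[Vertex],
               t: float) -> List[Vertex]:
    """Linearly interpolate every vertex between the two endpoint shapes of a base."""
    return [
        tuple(lo + t * (hi - lo) for lo, hi in zip(v_low, v_high))
        for v_low, v_high in zip(low_shape, high_shape)
    ]


# Assumed endpoint shapes: the forearm tip sits 10 cm or 30 cm along z.
forearm_10cm = [(0.0, 0.0, 10.0)]
forearm_30cm = [(0.0, 0.0, 30.0)]

# A target forearm length of 25 cm gives the control variable 0.75.
t = control_value(25.0, low=10.0, high=30.0)
forearm_mesh = blend_base(forearm_10cm, forearm_30cm, t)
```

Because each base would only move the vertices it owns, evaluating one base in this way leaves the vertices of other bases unchanged, which matches the independence described above.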

The skeleton consists of 170 bones, and the body-shape bases and the skeleton are made manually according to statistics on Asian body types. We also used big data for analysis, but not to train neural network models. Instead, we looked for the intervals that fit most Asian body types, covering parameters such as height, weight and length, and used these indices to describe a human body, especially an Asian human body, accurately. Starting from a set of basic bones, indices such as the positions and lengths of the bones are adjusted so that the bones look better and the skinning-driven process is more flexible to manipulate without much redundancy. Setting this many independent control parameters is meaningful only when the shape bases are self-built; both are indispensable for meeting the designer's standards.

The third part fits the parameters of the human body model to the mannequin. As shown in fig. 3, it comprises the following sub-steps: mapping the obtained three-dimensional human body posture and body type parameters to several groups of base and bone parameters of the three-dimensional standard human body model; and inputting the obtained groups of base and bone parameters into the three-dimensional standard human body model for fitting. The three-dimensional human body model has a mathematical weight relation between the bone points and the model mesh, so determining the bone points ties them to the human body model and determines the target human body posture.

In this part, the two types of parameters generated in the first part are substituted into the pre-designed human body model to construct the 3D human body model. The names of the two types of parameters are similar to those of the SMPL human body model of the Max Planck Institute, but what they actually contain differs greatly. The two models rest on different foundations: this method uses the self-made three-dimensional standard human body model (base mannequin), whereas the SMPL model uses a standard human body model generated by big-data training. The two are generated and calculated differently, and although both finally take the form of a generated 3D human body model, their content differs considerably. After this step, a preliminary 3D human body model is obtained, including the positions and lengths of the bones and the mesh of the human body model.

This part yields a fitted three-dimensional human body model. However, errors arise in the data calculation and processing of any human body model, so if the posture of the target human body is complex, accurate fitting is completed through multiple fitting passes. These passes are not averaged. Instead, multi-level iteration is used: the first level quickly drives the skeleton close to the target posture, which guarantees speed, and the second level fine-tunes the result according to the algorithm.

In the first fitting, only the key parts, that is, the parts with a large influence on the posture, are fitted, such as the rotation parameters of the bones of the neck, shoulders, elbows, waist, crotch, knees and ankles. Because few parameters take part in the calculation, an approximate posture of the target human body model can be fitted quickly, and this step ensures that the posture of the human body model is basically consistent with the target human body. A two-stage deep-learning neural network has already been designed in the previous steps to generate the posture and body type parameters of the human body model, so in the first fitting the required parameters can be obtained by adjusting the settings of the neural network once. For example, the original human body image, the bone posture image, the bone joint points and the semantic segmentation image can be input into the neural network as condition information, and the target-posture rotation state parameters of 30 basic bones in the self-made standard human body model are obtained by prediction. From this information, the approximate position of the target human body model can be fitted quickly, and the small amount of calculation greatly saves computation, simulation and time. The basic skeleton is defined manually and mainly includes the arms, legs and shoulders; the number of basic bones may also be other than 30.

Because our model is self-made and can have as many as 170 bones, it can represent many details of human posture, far more than the Max Planck Institute model, and the human posture can be represented more accurately by using the other bones on top of the basic skeleton. Therefore, to improve the accuracy of the posture fitting, a second fitting can be performed on the human body model as required, with all or some of the other parameters substituted into the model for calculation. The second fitting opens the degrees of freedom of the other bones of the standard human body model, obtains their parameters, takes conditions such as the edge contour as input, and iteratively fits the result of the first fitting. Since the model has already been moved basically into place and aligned with the human body posture of the real picture in the first step, the second step needs little fitting calculation. Although many more parameters are involved in this step, the amount of calculation is greatly reduced because the posture is only fine-tuned. After this step, a target human body model with a consistency of more than 95% is basically obtained.

If the result does not meet the requirement, a third, fine fitting can be performed on the basis of the previous fitting, restricted to the key parts and the regions of interest, which basically eliminates the remaining imperfections. If finger-level fitting accuracy is required, for example, this third refined fitting must be introduced: information on the hands or other important parts of interest is input, and further fitting is carried out on the basis of the second step. When the third fitting is performed, however, special attention must be paid to its effect on the simulation calculation time. Our measurements show that after the third fitting the accuracy of the model does improve further, but the total processing time basically reaches the level of a single full-parameter fitting. In general, the approach of the invention saves more than half the time compared with performing the simulation calculation for all posture parameters from start to finish. Actual measurement shows that the fitted model is highly realistic, the simulation of clothing and body posture is equivalent to that of the traditional method, and, most importantly, the processing time is shortened considerably. The multi-stage iteration makes the most of the self-built standard human body model provided by the invention: each bone is controlled independently, the basic large bones and the other bones are fitted separately, and the second fitting can be completed well thanks to the refined bone layout. This coarse-to-fine scheme therefore ensures fitting precision while achieving fast convergence.
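The coarse-to-fine idea described above can be sketched on a toy problem. The sketch below is an assumption-laden illustration, not the fitting algorithm of the invention: it replaces the real objective (joint points, contours, segmentation) with a squared error against known target angles, uses plain gradient descent, and the names `fit_stage`, `pose_grad` and the index sets are hypothetical.

```python
from typing import Callable, List, Sequence


def fit_stage(params: Sequence[float],
              active: Sequence[int],
              grad: Callable[[Sequence[float]], List[float]],
              steps: int = 20,
              lr: float = 0.25) -> List[float]:
    """Run gradient descent, updating only the parameters listed in `active`."""
    p = list(params)
    for _ in range(steps):
        g = grad(p)
        for i in active:  # frozen parameters keep their current value
            p[i] -= lr * g[i]
    return p


# Toy target pose: two "basic" bone angles and two "detail" bone angles (assumed values).
target = [30.0, -45.0, 5.0, 2.0]


def pose_grad(p: Sequence[float]) -> List[float]:
    """Gradient of the toy loss sum((p_i - target_i)^2)."""
    return [2.0 * (pi - ti) for pi, ti in zip(p, target)]


rest_pose = [0.0, 0.0, 0.0, 0.0]
basic_bones = [0, 1]               # stage 1: only bones with a large influence
all_bones = range(len(rest_pose))  # stage 2: open the remaining degrees of freedom
region_of_interest = [3]           # optional stage 3: e.g. a hand bone

coarse = fit_stage(rest_pose, basic_bones, pose_grad)
fine = fit_stage(coarse, all_bones, pose_grad)
refined = fit_stage(fine, region_of_interest, pose_grad, steps=10)
```

In this toy setting the stages are independent, so the time saving reported in the text is not reproduced; the sketch only shows how each stage starts from the previous result and restricts or widens the set of active parameters.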

In this part, the mannequin also changes from the initial posture to the target posture. Since only one photo is input, the posture of the target human body in the photo usually differs from the posture of the base mannequin, so fitting the target posture requires completing the change from the initial posture to the target posture. To make the fitting of the human body model more lifelike, fitting the groups of base and bone parameters in the three-dimensional standard human body model further comprises the following steps:

1) obtaining position coordinates of an initial posture and a target posture; the initial attitude parameters are obtained by the initial parameters of the standard mannequin model, and the bone information of the target attitude is obtained by the regression prediction of the neural network model.

2) Generating an animation sequence moving from an initial pose to a target pose; after the initial state of the bone information and the state parameter of the target posture are obtained, a time sequence of the bone information from the initial posture to the target posture is formed through frame interpolation modes such as linear interpolation, nearest neighbor interpolation and the like. In the driving process, according to the number of the bones driven by each frame, two modes of global linear interpolation, precursor movement of father nodes, secondary driving of child nodes and the like can be adopted, the driving state in the real simulation physical world is considered, the latter mode is adopted, the father bone nodes are precursor movement, and the secondary driving of the child bone nodes is performed, so that the motion of the animation sequence interpolated by the method is more fit with the real physical world, and the simulated effect is better.

3) In the process of generating the animation sequence, processing in a grid mesh frame inserting mode; after each frame drives the skeleton to move, the vertex of the human body model in the current state, namely the face information, is obtained through the calculation of the weight parameters of the standard human body platform, and the mesh state of the current human body model is updated, recorded and stored.

4) The frame interpolation speed is set to be slow in the positions of the front and back distance initial points and the target point and fast in the middle movement process; the method adopts a non-uniform frame interpolation rate, namely a mode that the moving amplitude of a single frame is small in the process of initial action and ending action and the moving amplitude of the moving intermediate process is large. The initial state of the physical action in the simulated real world has a certain acceleration process, the higher inter-frame displacement distance is kept in the motion process, and the driving speed is reduced to the end of the motion.

5) When the model has been driven to the final target pose, keeping it static for several frames to obtain the whole animation sequence. Compared with frame interpolation at a constant rate, this method is closer to the laws of motion of the real physical world, and the simulation effect is better.

6) Completing the driving of the bone from the initial pose to the target pose.
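
Steps 2) to 5) above can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the implementation disclosed here: each bone is reduced to a single scalar pose parameter, the non-uniform rate uses a smoothstep curve (one possible ease-in/ease-out curve), and the names `build_sequence`, `depth` and `lag` are hypothetical. Parent bones lead and child bones follow by delaying each bone's local time in proportion to its depth in the skeleton tree.

```python
from typing import Dict, List


def smoothstep(u: float) -> float:
    """Ease-in/ease-out curve: small change near u=0 and u=1, large in the middle."""
    u = min(1.0, max(0.0, u))
    return u * u * (3.0 - 2.0 * u)


def build_sequence(
    initial: Dict[str, float],
    target: Dict[str, float],
    depth: Dict[str, int],
    n_move: int = 10,
    n_hold: int = 3,
    lag: float = 0.2,
) -> List[Dict[str, float]]:
    """Return one dict per frame, mapping bone name to its pose parameter.

    depth: bone name -> depth in the skeleton tree (root = 0).
    lag:   assumed delay per tree level, so that children start after parents.
    """
    max_depth = max(depth.values())
    # Stretch the timeline so that the deepest bone finishes exactly at t = 1.
    total = 1.0 + lag * max_depth
    frames: List[Dict[str, float]] = []
    for k in range(1, n_move + 1):
        t = k / n_move  # uniform frame time in (0, 1]
        frame = {}
        for bone, start in initial.items():
            # Local time of this bone; deeper bones begin moving later.
            u = t * total - lag * depth[bone]
            # Non-uniform rate: smoothstep makes per-frame steps small at both ends.
            frame[bone] = start + (target[bone] - start) * smoothstep(u)
        frames.append(frame)
    # Step 5): hold the final target pose for a few frames.
    frames.extend(dict(target) for _ in range(n_hold))
    return frames
```

Calling `build_sequence({"hip": 0.0, "knee": 0.0}, {"hip": 90.0, "knee": 45.0}, {"hip": 0, "knee": 1})` yields ten moving frames followed by three held frames; in the first frame only the parent bone has started to move.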

Because the skeleton data and the mesh data have already been obtained, driving the skeleton becomes easier. An LBS (linear blend skinning) algorithm, a DQS (dual quaternion skinning) algorithm and collision bodies are used. Because the model of the standard mannequin is in a three-dimensional standard pose, the change from the initial pose to the target pose can cause unreasonable interpenetration between the meshes of the human body model, and such interpenetration can be avoided only by also using collision bodies.
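
For reference, the LBS step named above can be sketched as follows. This is the textbook form of linear blend skinning, in which each skinned vertex is the weighted blend of the rest-pose vertex transformed by every influencing bone; it is not the specific implementation used here, and it omits the DQS variant and collision handling. The 3x4 matrix layout and the function names are illustrative assumptions.

```python
from typing import List, Sequence

Vec3 = Sequence[float]
Mat34 = Sequence[Sequence[float]]  # 3 rows x 4 columns: rotation part and translation


def transform(m: Mat34, v: Vec3) -> List[float]:
    """Apply a 3x4 bone transform to a point."""
    return [m[r][0] * v[0] + m[r][1] * v[1] + m[r][2] * v[2] + m[r][3] for r in range(3)]


def linear_blend_skinning(
    vertices: Sequence[Vec3],
    weights: Sequence[Sequence[float]],
    bone_transforms: Sequence[Mat34],
) -> List[List[float]]:
    """weights[i][j] is the influence of bone j on vertex i; each row should sum to 1."""
    skinned = []
    for v, w_row in zip(vertices, weights):
        out = [0.0, 0.0, 0.0]
        for w, m in zip(w_row, bone_transforms):
            if w == 0.0:
                continue  # this bone does not influence the vertex
            p = transform(m, v)
            out = [o + w * c for o, c in zip(out, p)]
        skinned.append(out)
    return skinned


# Two bones: one stays in place, one is shifted by 2 units along x.
IDENTITY = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
SHIFT_X = [[1.0, 0.0, 0.0, 2.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
```

A vertex weighted equally between `IDENTITY` and `SHIFT_X` moves half of the shift, which is the blending behaviour that lets the mesh follow the skeleton smoothly near joints.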

The method of generating a three-dimensional human model according to an embodiment of the present invention described in conjunction with fig. 1 to 3 may be implemented by a human image processing apparatus. Fig. 4 is a diagram illustrating a hardware configuration 300 of an apparatus for processing a human body image according to an embodiment of the present invention.

The invention also discloses a human body model building system, which comprises: 1) a two-dimensional human body image acquisition module; 2) a node map obtaining module; 3) a semantic segmentation map obtaining module; 4) a regression module for human body posture and body type parameters; 5) a three-dimensional standard human body model fitting module; 6) a three-dimensional human body model data processing and refining module; and 7) a target human body model mesh output module.

The invention also discloses a computer-readable storage medium in which a computer program is stored; when executed by a processor, the computer program implements the steps of the modeling method described above.

The invention also discloses an electronic device comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with one another through the communication bus; the memory is used for storing a computer program; and the processor is used for implementing the steps of the modeling method described above when executing the program stored in the memory.

As shown in fig. 4, the apparatus 300 for implementing human body modeling in the present embodiment includes a processor 301, a memory 302, a communication interface 303 and a bus 310, wherein the processor 301, the memory 302 and the communication interface 303 are connected through the bus 310 and communicate with one another.

In particular, the processor 301 may include a Central Processing Unit (CPU) or an Application Specific Integrated Circuit (ASIC), or may be configured as one or more integrated circuits implementing an embodiment of the present invention.

Memory 302 may include mass storage for data or instructions. By way of example, and not limitation, memory 302 may include a hard disk drive (HDD), a floppy disk drive, flash memory, an optical disk, a magneto-optical disk, magnetic tape, a Universal Serial Bus (USB) drive, or a combination of two or more of these. Memory 302 may include removable or non-removable (or fixed) media, where appropriate. The memory 302 may be internal or external to the human image processing apparatus 300, where appropriate. In a particular embodiment, the memory 302 is a non-volatile solid-state memory. In a particular embodiment, the memory 302 includes Read Only Memory (ROM). Where appropriate, the ROM may be mask-programmed ROM, Programmable ROM (PROM), Erasable PROM (EPROM), Electrically Erasable PROM (EEPROM), Electrically Alterable ROM (EAROM), flash memory, or a combination of two or more of these.

The communication interface 303 is mainly used for implementing communication between modules, apparatuses, units and/or devices in the embodiment of the present invention.

The bus 310 includes hardware, software, or both to couple the components of the apparatus 300 for processing human body images to each other. By way of example, and not limitation, a bus may include an Accelerated Graphics Port (AGP) or other graphics bus, an Enhanced Industry Standard Architecture (EISA) bus, a Front Side Bus (FSB), a HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an InfiniBand interconnect, a Low Pin Count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCIe) bus, a Serial Advanced Technology Attachment (SATA) bus, a Video Electronics Standards Association local bus (VLB), or other suitable bus or a combination of two or more of these. Bus 310 may include one or more buses, where appropriate. Although specific buses have been described and shown in the embodiments of the invention, any suitable buses or interconnects are contemplated by the invention.

That is, the apparatus 300 for processing a human body image shown in fig. 4 may be implemented to include: a processor 301, a memory 302, a communication interface 303, and a bus 310. The processor 301, memory 302 and communication interface 303 are coupled by a bus 310 and communicate with each other. The memory 302 is used to store program code; the processor 301 executes a program corresponding to the executable program code by reading the executable program code stored in the memory 302 for executing the method of three-dimensional human body modeling in any embodiment of the present invention, thereby implementing the method and apparatus of three-dimensional human body modeling described in conjunction with fig. 1 to 3.

The embodiment of the invention also provides a computer storage medium, wherein the computer storage medium is stored with computer program instructions; the computer program instructions, when executed by a processor, implement the method for processing human body images provided by the embodiments of the present invention.

It is to be understood that the invention is not limited to the specific arrangements and instrumentality described above and shown in the drawings. A detailed description of known methods is omitted herein for the sake of brevity. In the above embodiments, several specific steps are described and shown as examples. However, the method processes of the present invention are not limited to the specific steps described and illustrated, and those skilled in the art can make various changes, modifications and additions or change the order between the steps after comprehending the spirit of the present invention.

The functional blocks shown in the above-described structural block diagrams may be implemented as hardware, software, firmware, or a combination thereof. When implemented in hardware, it may be, for example, an electronic circuit, an Application Specific Integrated Circuit (ASIC), suitable firmware, plug-in, function card, or the like. When implemented in software, the elements of the invention are the programs or code segments used to perform the required tasks. The program or code segments may be stored in a machine-readable medium or transmitted by a data signal carried in a carrier wave over a transmission medium or a communication link. A "machine-readable medium" may include any medium that can store or transfer information. Examples of a machine-readable medium include electronic circuits, semiconductor memory devices, ROM, flash memory, Erasable ROM (EROM), floppy disks, CD-ROMs, optical disks, hard disks, fiber optic media, Radio Frequency (RF) links, and so forth. The code segments may be downloaded via computer networks such as the internet, intranet, etc.

It should also be noted that the exemplary embodiments mentioned in this patent describe some methods or systems based on a series of steps or devices. However, the present invention is not limited to the order of the above-described steps, that is, the steps may be performed in the order mentioned in the embodiments, may be performed in an order different from the order in the embodiments, or may be performed simultaneously.

As described above, only the specific embodiments of the present invention are provided, and it can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the system, the module and the unit described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again. It should be understood that the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive various equivalent modifications or substitutions within the technical scope of the present invention, and these modifications or substitutions should be covered within the scope of the present invention.

Claims

1. A method of mannequin construction, the method comprising:

1) acquiring a two-dimensional image of a target human body;

2) processing to obtain a two-dimensional human body outline image of the target human body;

3) acquiring three-dimensional human body parameters of a target human body model through a two-stage neural network, wherein the three-dimensional human body parameters comprise three-dimensional human body posture parameters and three-dimensional human body shape parameters;

4) combining the mathematical model to construct a three-dimensional standard human body model;

6) mapping the obtained three-dimensional human body posture and body type parameters to a plurality of base and skeleton parameters of the three-dimensional standard human body model;

7) inputting the obtained groups of base and skeleton parameters into the three-dimensional standard human body model for fitting;

8) and obtaining a three-dimensional target human body model mesh with the same posture and body type as the target human body.

2. The method of claim 1, wherein the acquisition of the two-dimensional body contour image utilizes a target detection algorithm that is a target area fast generation network based on a convolutional neural network.

3. The method of claim 1, further comprising a process of training the neural network before inputting the two-dimensional human body image into the first neural network model, the training samples comprising standard two-dimensional human body images labeled with original joint point positions, the original joint point positions being manually labeled on the two-dimensional human body images with high accuracy.

4. The method of claim 1, wherein the step of obtaining the three-dimensional body parameters of the target human body model further comprises substituting the two-dimensional body contour image into the first neural network subjected to deep learning to perform joint point regression.

5. The method of claim 1, wherein the step of obtaining three-dimensional body parameters of the target human body model further comprises obtaining a joint point map of the target human body, semantic segmentation maps of the parts of the human body, and body key points and/or body skeleton points.

6. The method of claim 5, further comprising substituting the generated joint point map, semantic segmentation map, body skeleton points and key point information of the target human body into the second neural network subjected to deep learning to perform regression of human body posture and body type parameters.

7. The method according to claim 1, wherein the three-dimensional standard human body model is composed of parameters of a plurality of shape bases and parameters of a plurality of bones, the plurality of shape bases form the whole human body model, and each shape base is independently controlled and changed by the parameters of the base without influencing each other.

8. The method of claim 1, further comprising the steps of: 1) obtaining position coordinates of an initial pose and a target pose; 2) generating an animation sequence moving from the initial pose to the target pose; 3) in the process of generating the animation sequence, processing by mesh frame interpolation; 4) adopting a non-uniform frame interpolation speed, which is set to be slow near the initial point and the target point and fast during the middle of the movement; 5) reducing the driving speed toward the end of the movement, and keeping still for several frames at the final target pose to obtain the whole animation sequence; 6) completing the driving of the bone from the initial pose to the target pose.

9. The method of claim 1, wherein if the target human body posture is complex, the accurate fitting is performed by means of multi-stage fitting.

10. The method of claim 1, wherein the three-dimensional human body model has a mathematical weight relationship between the bone points and the model mesh, so that determining the bone points, in association with the human body model, determines the target human body posture.

11. A mannequin building system comprising: 1) a two-dimensional human body image acquisition module; 2) a node map obtaining module; 3) a semantic segmentation map obtaining module; 4) a regression module for human body posture and body type parameters; 5) a three-dimensional standard human body model fitting module; 6) a three-dimensional human body model data processing and refining module; and 7) a target human body model mesh output module.

12. A computer-readable storage medium, characterized in that a computer program is stored in the computer-readable storage medium, which computer program, when being executed by a processor, carries out the method steps of any one of the claims 1-10.

13. An electronic device, characterized by comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with one another through the communication bus; the memory is used for storing a computer program; and the processor is used for implementing the method steps of any one of claims 1-10 when executing the program stored in the memory.